POLYKETIDE SYNTHASE ENZYMES AND RECOMBINANT DNA
CONSTRUCTS THEREFOR 

Cross-Reference to Related Applications 
The present application claims priority to related U.S. patent application Serial 
Nos. 60/102,748, filed 2 Oct. 1998; 60/139,650, filed 17 June 1999; and 60/123,810, filed 
11 Mar. 1999, each of which is incorporated herein by reference.

Field of the Invention 
The present invention relates to polyketides and the polyketide synthase (PKS) 
enzymes that produce them. The invention also relates generally to genes encoding PKS 
enzymes and to recombinant host cells containing such genes and in which expression of 
such genes leads to the production of polyketides. The present invention also relates to 
compounds useful as medicaments having immunosuppressive and/or neurotrophic 
activity. Thus, the invention relates to the fields of chemistry, molecular biology, and 
agricultural, medical, and veterinary technology. 

Background of the Invention 
Polyketides are a class of compounds synthesized from 2-carbon units through a 
series of condensations and subsequent modifications. Polyketides occur in many types of 
organisms, including fungi and mycelial bacteria, in particular, the actinomycetes. 
Polyketides are biologically active molecules with a wide variety of structures, and the 
class encompasses numerous compounds with diverse activities. Tetracycline, 
erythromycin, epothilone, FK-506, FK-520, narbomycin, picromycin, rapamycin, 
spinosyn, and tylosin are examples of polyketides. Given the difficulty in producing
polyketide compounds by traditional chemical methodology, and the typically low 
production of polyketides in wild-type cells, there has been considerable interest in 
finding improved or alternate means to produce polyketide compounds.
This interest has resulted in the cloning, analysis, and manipulation by
recombinant DNA technology of genes that encode PKS enzymes. The resulting 
technology allows one to manipulate a known PKS gene cluster either to produce the 
polyketide synthesized by that PKS at higher levels than occur in nature or in hosts that 
otherwise do not produce the polyketide. The technology also allows one to produce 
molecules that are structurally related to, but distinct from, the polyketides produced from 
known PKS gene clusters. See, e.g., PCT publication Nos. WO 93/13663; 95/08548; 
96/40968; 97/02358; 98/27203; and 98/49315; United States Patent Nos. 4,874,748; 
5,063,155; 5,098,837; 5,149,639; 5,672,491; 5,712,146; 5,830,750; and 5,843,718; and 
Fu et al., 1994, Biochemistry 33: 9321-9326; McDaniel et al., 1993, Science 262: 1546-
1550; and Rohr, 1995, Angew. Chem. Int. Ed. Engl. 34(8): 881-888, each of which is
incorporated herein by reference. 

Polyketides are synthesized in nature by PKS enzymes. These enzymes, which are
complexes of multiple large proteins, are similar to the synthases that catalyze 
condensation of 2-carbon units in the biosynthesis of fatty acids. PKSs catalyze the 
biosynthesis of polyketides through repeated, decarboxylative Claisen condensations
between acylthioester building blocks. The building blocks used to form complex 
polyketides are typically acylthioesters, such as acetyl, butyryl, propionyl, malonyl, 
hydroxymalonyl, methylmalonyl, and ethylmalonyl CoA. Other building blocks include 
amino acid-like acylthioesters. PKS enzymes that incorporate such building blocks
include an activity that functions as an amino acid ligase (an AMP ligase) or as a non- 
ribosomal peptide synthetase (NRPS). Two major types of PKS enzymes are known; 
these differ in their composition and mode of synthesis of the polyketide synthesized. 
These two major types of PKS enzymes are commonly referred to as Type I or "modular"
and Type II or "iterative" PKS enzymes.

In the Type I or modular PKS enzyme group, a set of separate catalytic active 
sites (each active site is termed a "domain", and a set thereof is termed a "module") exists 
for each cycle of carbon chain elongation and modification in the polyketide synthesis 
pathway. The typical modular PKS is composed of several large polypeptides, which can 
be segregated from amino to carboxy termini into a loading module, multiple extender
modules, and a releasing (or thioesterase) domain. The PKS enzyme known as
6-deoxyerythronolide B synthase (DEBS) is a Type I PKS. In DEBS, there is a loading
module, six extender modules, and a thioesterase (TE) domain. The loading module, six 
extender modules, and TE of DEBS are present on three separate proteins (designated 
DEBS-1, DEBS-2, and DEBS-3, with two extender modules per protein). Each of the 
DEBS polypeptides is encoded by a separate open reading frame (ORF) or gene; these 
genes are known as eryAI, eryAII, and eryAIII. See Caffrey et al., 1992, FEBS Letters 
304: 205, and U.S. Patent No. 5,824,513, each of which is incorporated herein by
reference.

Generally, the loading module is responsible for binding the first building block 
used to synthesize the polyketide and transferring it to the first extender module. The 
loading module of DEBS consists of an acyl transferase (AT) domain and an acyl carrier
protein (ACP) domain. Another type of loading module utilizes an inactivated
ketosynthase (KS) domain and AT and ACP domains. This inactivated KS is in some
instances called KS^Q, where the superscript letter is the abbreviation for the amino acid,
glutamine, that is present instead of the active site cysteine required for ketosynthase
activity. In other PKS enzymes, including the FK-506 PKS, the loading module
incorporates an unusual starter unit and is composed of a CoA ligase-like activity domain.
In any event, the loading module recognizes a particular acyl-CoA (usually acetyl or 
propionyl but sometimes butyryl or other acyl-CoA) and transfers it as a thiol ester to the 
ACP of the loading module. 

The AT on each of the extender modules recognizes a particular extender-CoA 
(malonyl or alpha-substituted malonyl, i.e., methylmalonyl, ethylmalonyl, and 2- 
hydroxymalonyl) and transfers it to the ACP of that extender module to form a thioester. 
Each extender module is responsible for accepting a compound from a prior module, 
binding a building block, attaching the building block to the compound from the prior 
module, optionally performing one or more additional functions, and transferring the 
resulting compound to the next module. 

Each extender module of a modular PKS contains a KS, AT, ACP, and zero, one, 
two, or three domains that modify the beta-carbon of the growing polyketide chain. A
typical (non-loading) minimal Type I PKS extender module is exemplified by extender
module three of DEBS, which contains a KS domain, an AT domain, and an ACP 
domain. These three domains are sufficient to activate a 2-carbon extender unit and attach 
it to the growing polyketide molecule. The next extender module, in turn, is responsible 
for attaching the next building block and transferring the growing compound to the next 
extender module until synthesis is complete. 

Once the PKS is primed with acyl- and malonyl-ACPs, the acyl group of the 
loading module is transferred to form a thiol ester (trans-esterification) at the KS of the 
first extender module; at this stage, extender module one possesses an acyl-KS and a 
malonyl (or substituted malonyl) ACP. The acyl group derived from the loading module 
is then covalently attached to the alpha-carbon of the malonyl group to form a carbon-
carbon bond, driven by concomitant decarboxylation, generating a new acyl-ACP
that has a backbone two carbons longer than the loading building block (elongation or 
extension). 

The polyketide chain, growing by two carbons at each extender module, is
sequentially passed as covalently bound thiol esters from extender module to extender 
module, in an assembly line-like process. The carbon chain produced by this process 
alone would possess a ketone at every other carbon atom, producing a polyketone, from 
which the name polyketide arises. Most commonly, however, additional enzymatic 
activities modify the beta-keto group of each two-carbon unit just after it has been added
to the growing polyketide chain but before it is transferred to the next module. 
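
The two-carbons-per-module bookkeeping described above can be sketched in a few lines of Python. This is an illustrative calculation only, not part of the disclosed methods: the function names are hypothetical, carbons are numbered from the thioester carbon (C-1) as an assumed convention, and the three-carbon starter in the example is an assumption (the text says only that the starter is usually acetyl or propionyl).

```python
# Illustrative sketch (hypothetical helper names, not from the source text).

def backbone_carbons(starter_carbons, extender_modules):
    """Backbone length when each extender module adds two carbons."""
    if starter_carbons < 1 or extender_modules < 0:
        raise ValueError("starter must have >= 1 carbon and modules must be >= 0")
    return starter_carbons + 2 * extender_modules


def unreduced_keto_positions(extender_modules):
    """Keto positions if no module modified its beta-keto group.

    Assumed numbering: C-1 is the thioester carbon of the finished chain,
    so each condensation leaves a ketone at C-3, C-5, ... (every other carbon).
    """
    if extender_modules < 0:
        raise ValueError("modules must be >= 0")
    return [2 * i + 1 for i in range(1, extender_modules + 1)]


if __name__ == "__main__":
    # Six extender modules as stated for DEBS; three-carbon starter is assumed.
    print(backbone_carbons(3, 6))
    print(unreduced_keto_positions(6))
```

With six extender modules and an assumed three-carbon starter, the sketch gives a 15-carbon backbone and, for the hypothetical fully unprocessed chain, ketones at the odd-numbered carbons 3 through 13.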

Thus, in addition to the minimal module containing KS, AT, and ACP domains 
necessary to form the carbon-carbon bond, and as noted above, other domains that 
modify the beta-carbonyl moiety can be present. Thus, modules may contain a 
ketoreductase (KR) domain that reduces the keto group to an alcohol. Modules may also 
contain a KR domain plus a dehydratase (DH) domain that dehydrates the alcohol to a 
double bond. Modules may also contain a KR domain, a DH domain, and an 
enoylreductase (ER) domain that converts the double bond product to a saturated single
bond, leaving the beta-carbon as a methylene function. An extender module can also contain
other enzymatic activities, such as, for example, a methylase or dimethylase activity. 
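
The relationship between the beta-carbon processing domains in a module and the resulting functional group can be summarized as a small lookup, sketched here in Python. The function name and return labels are hypothetical, and the sketch assumes every domain listed is catalytically active, which is a simplification (some DH domains, for example, may be present but non-functional).

```python
# Illustrative sketch (hypothetical names): maps the reductive domains of one
# extender module to the group left at the beta-carbon, per the text above.

def beta_carbon_state(domains):
    """Return the beta-carbon functional group for a list of domain names."""
    present = set(domains)
    has_kr = "KR" in present
    has_dh = "DH" in present
    has_er = "ER" in present
    if has_kr and has_dh and has_er:
        return "methylene"    # KR + DH + ER: fully reduced, saturated carbon
    if has_kr and has_dh:
        return "double bond"  # KR + DH: alcohol dehydrated to an alkene
    if has_kr:
        return "hydroxyl"     # KR only: keto group reduced to an alcohol
    return "ketone"           # minimal module: beta-keto group left intact


if __name__ == "__main__":
    for module in (
        ["KS", "AT", "ACP"],
        ["KS", "AT", "KR", "ACP"],
        ["KS", "AT", "DH", "KR", "ACP"],
        ["KS", "AT", "DH", "ER", "KR", "ACP"],
    ):
        print("-".join(module), "->", beta_carbon_state(module))
```

A DH or ER domain without a KR domain is treated as leaving the ketone untouched, since each step acts on the product of the preceding one.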

After traversing the final extender module, the polyketide encounters a releasing
domain that cleaves the polyketide from the PKS and typically cyclizes the polyketide.
For example, final synthesis of 6-dEB is regulated by a TE domain located at the end of
extender module six. In the synthesis of 6-dEB, the TE domain catalyzes cyclization of
the macrolide ring by formation of an ester linkage. In FK-506, FK-520, rapamycin, and
similar polyketides, the TE activity is replaced by a RapP (for rapamycin) or RapP-like
activity that makes a linkage incorporating a pipecolic acid residue. The enzymatic
activity that catalyzes this incorporation for the rapamycin enzyme is known as RapP,
encoded by the rapP gene. The polyketide can be modified further by tailoring enzymes;
these enzymes add carbohydrate groups or methyl groups, or make other modifications,
i.e., oxidation or reduction, on the polyketide core molecule. For example, 6-dEB is
hydroxylated at C-6 and C-12 and glycosylated at C-3 and C-5 in the synthesis of
erythromycin A.

In Type I PKS polypeptides, the order of catalytic domains is conserved. When all
beta-keto processing domains are present in a module, the order of domains in that
module from N-to-C-terminus is always KS, AT, DH, ER, KR, and ACP. Some or all of
the beta-keto processing domains may be missing in particular modules, but the order of
the domains present in a module remains the same. The order of domains within modules
is believed to be important for proper folding of the PKS polypeptides into an active
complex. Importantly, there is considerable flexibility in PKS enzymes, which allows for
the genetic engineering of novel catalytic complexes. The engineering of these enzymes
is achieved by modifying, adding, or deleting domains, or replacing them with those
taken from other Type I PKS enzymes. It is also achieved by deleting, replacing, or
adding entire modules with those taken from other sources. A genetically engineered
PKS complex should of course have the ability to catalyze the synthesis of the product
predicted from the genetic alterations made.
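
The conserved domain order lends itself to a simple consistency check on a proposed extender module design. The following Python sketch is illustrative only: the function and constant names are hypothetical, and it checks nothing beyond the ordering rule and the presence of the minimal KS, AT, and ACP domains stated in the text (it does not model linkers, folding, or loading modules).

```python
# Illustrative sketch (hypothetical names): checks a proposed extender module
# against the N-to-C-terminal domain order given in the text.

CANONICAL_ORDER = ["KS", "AT", "DH", "ER", "KR", "ACP"]
MINIMAL_DOMAINS = {"KS", "AT", "ACP"}


def follows_canonical_order(domains):
    """True if the domains appear in canonical order, with no repeats,
    and include the minimal KS, AT, and ACP domains."""
    if any(d not in CANONICAL_ORDER for d in domains):
        return False  # unknown domain name
    positions = [CANONICAL_ORDER.index(d) for d in domains]
    strictly_increasing = all(a < b for a, b in zip(positions, positions[1:]))
    return strictly_increasing and MINIMAL_DOMAINS <= set(domains)


if __name__ == "__main__":
    print(follows_canonical_order(["KS", "AT", "KR", "ACP"]))        # order kept
    print(follows_canonical_order(["KS", "AT", "KR", "DH", "ACP"]))  # KR before DH
```

Such a check could serve as a first filter on domain-swap designs before the linker boundaries discussed in the following paragraphs are considered.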

Alignments of the many available amino acid sequences for Type I PKS enzymes
have approximately defined the boundaries of the various catalytic domains. Sequence
alignments also have revealed linker regions between the catalytic domains and at the N-
and C-termini of individual polypeptides. The sequences of these linker regions are less
well conserved than are those for the catalytic domains, which is in part how linker
regions are identified. Linker regions can be important for proper association between 
domains and between the individual polypeptides that comprise the PKS complex. One 
can thus view the linkers and domains together as creating a scaffold on which the 
domains and modules are positioned in the correct orientation to be active. This 
organization and positioning, if retained, permits PKS domains of different or identical 
substrate specificities to be substituted (usually at the DNA level) between PKS enzymes 
by various available methodologies. In selecting the boundaries of, for example, an AT 
replacement, one can thus make the replacement so as to retain the linkers of the recipient 
PKS or to replace them with the linkers of the donor PKS AT domain, or, preferably, 
make both constructs to ensure that the correct linker regions between the KS and AT 
domains have been included in at least one of the engineered enzymes. Thus, there is 
considerable flexibility in the design of new PKS enzymes with the result that known 
polyketides can be produced more effectively, and novel polyketides useful as 
pharmaceuticals or for other purposes can be made. 

By appropriate application of recombinant DNA technology, a wide variety of 
polyketides can be prepared in a variety of different host cells provided one has access to 
nucleic acid compounds that encode PKS proteins and polyketide modification enzymes. 
The present invention helps meet the need for such nucleic acid compounds by providing 
recombinant vectors that encode the FK-520 PKS enzyme and various FK-520 
modification enzymes. Moreover, while the FK-506 and FK-520 polyketides have many 
useful activities, there remains a need for compounds with similar useful activities but 
with better pharmacokinetic profile and metabolism and fewer side-effects. The present 
invention helps meet the need for such compounds as well. 



Summary of the Invention
In one embodiment, the present invention provides recombinant DNA vectors that
encode all or part of the FK-520 PKS enzyme. Illustrative vectors of the invention
include cosmids pKOS034-120, pKOS034-124, pKOS065-C31, pKOS065-C3, pKOS065-
M27, and pKOS065-M21. The invention also provides nucleic acid compounds that
encode the various domains of the FK-520 PKS, i.e., the KS, AT, ACP, KR, DH, and ER
domains. These compounds can be readily used, alone or in combination with nucleic
acids encoding other FK-520 or non-FK-520 PKS domains, as intermediates in the
construction of recombinant vectors that encode all or part of PKS enzymes that make
novel polyketides.

The invention also provides isolated nucleic acids that encode all or part of one or 
more modules of the FK-520 PKS, each module comprising a ketosynthase activity, an 
acyl transferase activity, and an acyl carrier protein activity. The invention provides an 
isolated nucleic acid that encodes one or more open reading frames of FK-520 PKS
genes, said open reading frames comprising coding sequences for a CoA ligase activity,
an NRPS activity, or two or more extender modules. The invention also provides 
recombinant expression vectors containing these nucleic acids. 

In another embodiment, the invention provides isolated nucleic acids that encode 
all or a part of a PKS that contains at least one module in which at least one of the
domains in the module is a domain from a non-FK-520 PKS and at least one domain is
from the FK-520 PKS. The non-FK-520 PKS domain or module originates from the 
rapamycin PKS, the FK-506 PKS, DEBS, or another PKS. The invention also provides 
recombinant expression vectors containing these nucleic acids. 



In another embodiment, the invention provides a method of preparing a
polyketide, said method comprising transforming a host cell with a recombinant DNA
vector that encodes at least one module of a PKS, said module comprising at least one
FK-520 PKS domain, and culturing said host cell under conditions such that said PKS is
produced and catalyzes synthesis of said polyketide. In one aspect, the method is
practiced with a Streptomyces host cell. In another aspect, the polyketide produced is FK-
520. In another aspect, the polyketide produced is a polyketide related in structure to FK-
520. In another aspect, the polyketide produced is a polyketide related in structure to FK-
506 or rapamycin.

In another embodiment, the invention provides a set of genes in recombinant form
sufficient for the synthesis of ethylmalonyl CoA in a heterologous host cell. These genes
and the methods of the invention enable one to create recombinant host cells with the
ability to produce polyketides or other compounds that require ethylmalonyl CoA for
biosynthesis. The invention also provides recombinant nucleic acids that encode AT 
domains specific for ethylmalonyl CoA. Thus, the compounds of the invention can be 
used to produce polyketides requiring ethylmalonyl CoA in host cells that otherwise are 
unable to produce such polyketides. 

In another embodiment, the invention provides a set of genes in recombinant form 
sufficient for the synthesis of 2-hydroxymalonyl CoA and 2-methoxymalonyl CoA in a
heterologous host cell. These genes and the methods of the invention enable one to create 
recombinant host cells with the ability to produce polyketides or other compounds that 
require 2-hydroxymalonyl CoA for biosynthesis. The invention also provides 
recombinant nucleic acids that encode AT domains specific for 2-hydroxymalonyl CoA 
and 2-methoxymalonyl CoA. Thus, the compounds of the invention can be used to 
produce polyketides requiring 2-hydroxymalonyl CoA or 2-methoxymalonyl CoA in host 
cells that are otherwise unable to produce such polyketides. 

In another embodiment, the invention provides a compound related in structure to 
FK-520 or FK-506 that is useful in the treatment of a medical condition. These 
compounds include compounds in which the C-13 methoxy group is replaced by a moiety 
selected from the group consisting of hydrogen, methyl, and ethyl moieties. Such 
compounds are less susceptible to the main in vivo pathway of degradation for FK-520 
and FK-506 and related compounds and thus exhibit an improved pharmacokinetic 
profile. The compounds of the invention also include compounds in which the C-15 
methoxy group is replaced by a moiety selected from the group consisting of hydrogen, 
methyl, and ethyl moieties. The compounds of the invention also include the above 
compounds further modified by chemical methodology to produce derivatives such as, 
but not limited to, the C-18 hydroxyl derivatives, which have potent neurotrophic but not
immunosuppressive activity.



PATENT
AttyDkt: 300622002600

Thus, the invention provides polyketides having the structure: 




[Structure diagram not reproduced; the surviving labels are OH and OMe]



wherein R1 is hydrogen, methyl, ethyl, or allyl; R2 is hydrogen or hydroxyl, provided
that when R2 is hydrogen, there is a double bond between C-20 and C-19; R3 is hydrogen
or hydroxyl; R4 is methoxyl, hydrogen, methyl, or ethyl; and R5 is methoxyl, hydrogen,
methyl, or ethyl; but not including FK-506, FK-520, 18-hydroxy-FK-520, and 18- 
hydroxy-FK-506. The invention provides these compounds in purified form and in 
pharmaceutical compositions. 

In another embodiment, the invention provides a method for treating a medical 
condition by administering a pharmaceutically efficacious dose of a compound of the
invention. The compounds of the invention may be administered to achieve 
immunosuppression or to stimulate nerve growth and regeneration. 

These and other embodiments and aspects of the invention will be more fully 
understood after consideration of the attached Drawings and their brief description below, 
together with the detailed description, examples, and claims that follow. 



Brief Description of the Drawings
Figure 1 shows a diagram of the FK-520 biosynthetic gene cluster. The top line
provides a scale in kilobase pairs (kb). The second line shows a restriction map with
selected restriction enzyme recognition sequences indicated. K is KpnI; X is XhoI; S is
SacI; P is PstI; and E is EcoRI. The third line indicates the position of FK-520 PKS and
related genes. Genes are abbreviated with a one letter designation, i.e., C is fkbC.
Immediately under the third line are numbered segments showing where the loading
module (L) and ten different extender modules (numbered 1 - 10) are encoded on the
various genes shown. At the bottom of the Figure, the DNA inserts of various cosmids of
the invention (i.e., 34-124 is cosmid pKOS034-124) are shown in alignment with the FK-
520 biosynthetic gene cluster.

Figure 2 shows the loading module (load), the ten extender modules, and the
peptide synthetase domain of the FK-520 PKS, together with, on the top line, the genes
that encode the various domains and modules. Also shown are the various intermediates
in FK-520 biosynthesis, as well as the structure of FK-520, with carbons 13, 15, 21, and
31 numbered. The various domains of each module and subdomains of the loading
module are also shown. The darkened circles showing the DH domains in modules 2, 3,
and 4 indicate that the dehydratase domain is not functional as a dehydratase; this domain
may affect the stereochemistry at the corresponding position in the polyketide. The
substituents on the FK-520 structure that result from the action of non-PKS enzymes are
also indicated by arrows, together with the types of enzymes or the genes that code for
the enzymes that mediate the action. Although the methyltransferase is shown acting at
the C-13 and C-15 hydroxyl groups after release of the polyketide from the PKS, the
methyltransferase may act on the 2-hydroxymalonyl substrate prior to or
contemporaneously with its incorporation during polyketide synthesis.

Figure 3 shows a close-up view of the left end of the FK-520 gene cluster, which
contains at least ten additional genes. The ethyl side chain on carbon 21 of FK-520
(Figure 2) is derived from an ethylmalonyl CoA extender unit that is incorporated by an
ethylmalonyl specific AT domain in extender module 4 of the PKS. At least four of the
genes in this region code for enzymes involved in ethylmalonyl biosynthesis. The
polyhydroxybutyrate depolymerase is involved in maintaining hydroxybutyryl-CoA
pools during FK-520 production. Polyhydroxybutyrate accumulates during vegetative
growth and disappears during stationary phase in other Streptomyces (Ranade and
Vining, 1993, Can. J. Microbiol. 39:377). Open reading frames with unknown function
are indicated with a question mark.

Figure 4 shows a biosynthetic pathway for the biosynthesis of ethylmalonyl CoA
from acetoacetyl CoA consistent with the function assigned to four of the genes in the 
FK-520 gene cluster shown in Figure 3. 

Figure 5 shows a close-up view of the right-end of the FK-520 PKS gene cluster 
(and of the sequences on cosmid pKOS065-C31). The genes shown include fkbD, fkbM
(a methyl transferase that methylates the hydroxyl group on C-31 of FK-520), fkbN (a
homolog of a gene described as a regulator of cholesterol oxidase and that is believed to
be a transcriptional activator), fkbQ (a type II thioesterase, which can increase polyketide
production levels), and fkbS (a crotonyl-CoA reductase involved in the biosynthesis of
ethylmalonyl CoA). 

Figure 6 shows the proposed degradative pathway for tacrolimus (FK-506) 
metabolism. 

Figure 7 shows a schematic process for the construction of recombinant PKS 
genes of the invention that encode PKS enzymes that produce 13-desmethoxy FK-506 
and FK-520 polyketides of the invention, as described in Example 4, below. 

Figure 8, in Parts A and B, shows certain compounds of the invention preferred
for dermal application in Part A and a synthetic route for making those compounds in
Part B.

Detailed Description of the Invention
Given the valuable pharmaceutical properties of polyketides, there is a need for
methods and reagents for producing large quantities of polyketides, as well as for
producing related compounds not found in nature. The present invention provides such
methods and reagents, with particular application to methods and reagents for producing
the polyketides known as FK-520, also known as ascomycin or L-683,590 (see Holt et
al., 1993, JACS 115:9925), and FK-506, also known as tacrolimus. Tacrolimus is a
macrolide immunosuppressant used to prevent or treat rejection of transplanted heart,
kidney, liver, lung, pancreas, and small bowel allografts. The drug is also useful for the
prevention and treatment of graft-versus-host disease in patients receiving bone marrow
transplants, and for the treatment of severe, refractory uveitis. There have been additional
reports of the unapproved use of tacrolimus for other conditions, including alopecia
universalis, autoimmune chronic active hepatitis, inflammatory bowel disease, multiple
sclerosis, primary biliary cirrhosis, and scleroderma. The invention provides methods and
reagents for making novel polyketides related in structure to FK-520 and FK-506, and
structurally related polyketides such as rapamycin.

The FK-506 and rapamycin polyketides are potent immunosuppressants, with 
chemical structures shown below. 




FK-520 differs from FK-506 in that it lacks the allyl group at C-21 of FK-506, having 
instead an ethyl group at that position, and has similar activity to FK-506, albeit reduced 
immunosuppressive activity. 

These compounds act through initial formation of an intermediate complex with 
protein "immunophilins" known as FKBPs (FK-506 binding proteins), including FKBP- 
12. Immunophilins are a class of cytosolic proteins that form complexes with molecules 
such as FK-506, FK-520, and rapamycin that in turn serve as ligands for other cellular 
targets involved in signal transduction. Binding of FK-506, FK-520, and rapamycin to 
FKBP occurs through the structurally similar segments of the polyketide molecules, 
known as the "FKBP-binding domain" (as generally but not precisely indicated by the 
stippled regions in the structures above). The FK-506-FKBP complex then binds 
calcineurin, while the rapamycin-FKBP complex binds to a protein known as RAFT-1.
Binding of the FKBP-polyketide complex to these second proteins occurs through the
dissimilar regions of the drugs known as the "effector" domains. 



The three-component FKBP-polyketide-effector complex is required for signal
transduction and subsequent immunosuppressive activity of FK-506, FK-520, and 
rapamycin. Modifications in the effector domains of FK-506, FK-520, and rapamycin 
that destroy binding to the effector proteins (calcineurin or RAFT) lead to loss of 
immunosuppressive activity, even though FKBP binding is unaffected. Further, such 
analogs antagonize the immunosuppressive effects of the parent polyketides, because 
they compete for FKBP. Such non-immunosuppressive analogs also show reduced 
toxicity (see Dumont et al., 1992, Journal of Experimental Medicine 176: 751-760),
indicating that much of the toxicity of these drugs is not linked to FKBP binding. 

In addition to immunosuppressive activity, FK-520, FK-506, and rapamycin have 
neurotrophic activity. In the central nervous system and in peripheral nerves, 
immunophilins are referred to as "neuroimmunophilins". The neuroimmunophilin FKBP 
is markedly enriched in the central nervous system and in peripheral nerves. Molecules 
that bind to the neuroimmunophilin FKBP, such as FK-506 and FK-520, have the 
remarkable effect of stimulating nerve growth. In vitro, they act as neurotrophins, i.e.,
they promote neurite outgrowth in NGF-treated PC12 cells and in sensory neuronal
cultures, and in intact animals, they promote regrowth of damaged facial and sciatic
nerves, and repair lesioned serotonin and dopamine neurons in the brain. See Gold et al.,
Jun. 1999, J. Pharm. Exp. Ther. 289(3): 1202-1210; Lyons et al., 1994, Proc. National
Academy of Science 91: 3191-3195; Gold et al., 1995, Journal of Neuroscience 15: 7509-
7516; and Steiner et al., 1997, Proc. National Academy of Science 94: 2019-2024.
Further, the restored central and peripheral neurons appear to be functional. 

Compared to protein neurotrophic molecules (BDNF, NGF, etc.), the small-
molecule neurotrophins such as FK-506, FK-520, and rapamycin have different, and
often advantageous, properties. First, whereas protein neurotrophins are difficult to 
deliver to their intended site of action and may require intra-cranial injection, the small- 
molecule neurotrophins display excellent bioavailability; they are active when 
administered subcutaneously and orally. Second, whereas protein neurotrophins show 
quite specific effects, the small-molecule neurotrophins show rather broad effects. 
Finally, whereas protein neurotrophins often show effects on normal sensory nerves, the 
small-molecule neurotrophins do not induce aberrant sprouting of normal neuronal 
processes and seem to affect damaged nerves specifically. Neuroimmunophilin ligands 
have potential therapeutic utility in a variety of disorders involving nerve degeneration 
(e.g. multiple sclerosis, Parkinson's disease, Alzheimer's disease, stroke, traumatic spinal 
cord and brain injury, peripheral neuropathies). 

Recent studies have shown that the immunosuppressive and neurite outgrowth 
activity of FK-506, FK-520, and rapamycin can be separated; the neuroregenerative 
activity in the absence of immunosuppressive activity is retained by agents which bind to 
FKBP but not to the effector proteins calcineurin or RAFT. See Steiner et al., 1997, 
Nature Medicine 3: 421-428. 
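
The qualitative structure-activity logic laid out in the preceding paragraphs (FKBP binding plus effector binding for immunosuppression; FKBP binding alone for neurotrophic activity and antagonism of the parent compounds) can be written out as a small decision function. The Python sketch below is a deliberately coarse summary with hypothetical names; it treats binding as all-or-nothing and is not a predictive model.

```python
# Illustrative sketch (hypothetical names): encodes the qualitative rules
# stated in the text for ligands of FKBP and the effector proteins.

def predicted_profile(binds_fkbp, binds_effector):
    """Return the qualitative activity profile for a ligand.

    binds_fkbp: ligand binds the immunophilin FKBP.
    binds_effector: the ligand-FKBP complex binds calcineurin or RAFT.
    """
    if not binds_fkbp:
        # Without FKBP binding, none of the described activities is expected.
        return {"immunosuppressive": False,
                "neurotrophic": False,
                "antagonizes_parent": False}
    return {
        "immunosuppressive": binds_effector,       # needs the three-component complex
        "neurotrophic": True,                      # tracks FKBP binding
        "antagonizes_parent": not binds_effector,  # competes for FKBP only
    }


if __name__ == "__main__":
    print(predicted_profile(binds_fkbp=True, binds_effector=True))
    print(predicted_profile(binds_fkbp=True, binds_effector=False))
```

The second call corresponds to the class of analogs of interest here: ligands that retain FKBP binding but have lost the effector interaction.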




[Structure diagram not reproduced; the surviving label is "Nerve Regeneration"]

Available structure-activity data show that the important features for neurotrophic
activity of rapamycin, FK-520, and FK-506 lie within the common, contiguous segments
of the macrolide ring that bind to FKBP. This portion of the molecule is termed the
"FKBP binding domain" (see Van Duyne et al., 1993, Journal of Molecular Biology 229:
105-124). Nevertheless, the effector domains of the parent macrolides contribute to
conformational rigidity of the binding domain and thus indirectly contribute to FKBP 
binding. 



There are a number of other reported analogs of FK-506, FK-520, and rapamycin that 
bind to FKBP but not the effector protein calcineurin or RAFT. These analogs show 
effects on nerve regeneration without immunosuppressive effects. 

Naturally occurring FK-520 and FK-506 analogs include the antascomycins, 
which are FK-506-like macrolides that lack the functional groups of FK-506 that bind to 
calcineurin (see Fehr et al., 1996, The Journal of Antibiotics 49: 230-233). These
molecules bind FKBP as effectively as does FK-506; they antagonize the effects of both 
FK-506 and rapamycin, yet lack immunosuppressive activity. 




[Structure diagram not reproduced; the surviving label is "FKBP binding domain 1"]

Other analogs can be produced by chemically modifying FK-506, FK-520, or
rapamycin. One approach to obtaining neuroimmunophilin ligands is to destroy the
effector binding region of FK-506, FK-520, or rapamycin by chemical modification.
While the chemical modifications permitted on the parent compounds are quite limited,
some useful chemically modified analogs exist. The FK-520 analog L-685,818 (ED50 =
0.7 nM for FKBP binding; see Dumont et al., 1992), and the rapamycin analog WAY-
124,466 (IC50 = 12.5 nM; see Ocain et al., 1993, Biochemical and Biophysical Research
Communications 192: 1340-1346) are about as effective as FK-506, FK-520, and
rapamycin at promoting neurite outgrowth in sensory neurons (see Steiner et al., 1997).




L-685,818 WAY-124,466 



One of the few positions of rapamycin that is readily amenable to chemical 
modification is the allylic 16-methoxy group; this reactive group is readily exchanged by 
acid-catalyzed nucleophilic substitution. Replacement of the 16-methoxy group of 
rapamycin with a variety of bulky groups has produced analogs showing selective loss of 
immunosuppressive activity while retaining FKBP-binding (see Luengo et al., 1995, 
Chemistry & Biology 2: 471-481). One of the best compounds, 1, below, shows complete 
loss of activity in the splenocyte proliferation assay with only a 10-fold reduction in 
binding to FKBP. 




There are also synthetic analogs of FKBP binding domains. These compounds 
reflect an approach to obtaining neuroimmunophilin ligands based on "rationally 
designed" molecules that retain the FKBP-binding region in an appropriate conformation 
for binding to FKBP, but do not possess the effector binding regions. In one example, the 
ends of the FKBP binding domain were tethered by hydrocarbon chains (see Holt et al., 
1993, Journal of the American Chemical Society 115: 9925-9938); the best analog, 2, 
below, binds to FKBP about as well as FK-506. In a similar approach, the ends of the 
FKBP binding domain were tethered by a tripeptide to give analog 3, below, which binds 
to FKBP about 20-fold more weakly than FK-506. These compounds are anticipated to have 
neuroimmunophilin binding activity. 

2 



3 



In a primate MPTP model of Parkinson's disease, administration of FKBP ligand 
GPI-1046 caused brain cells to regenerate and behavioral measures to improve. MPTP is 
a neurotoxin, which, when administered to animals, selectively damages nigral-striatal 
dopamine neurons in the brain, mimicking the damage caused by Parkinson's disease. 
Whereas, before treatment, animals were unable to use affected limbs, the FKBP ligand 
restored the ability of animals to feed themselves and gave improvements in measures of 
locomotor activity, neurological outcome, and fine motor control. There were also 
corresponding increases in regrowth of damaged nerve terminals. These results 
demonstrate the utility of FKBP ligands for treatment of diseases of the CNS. 

From the above description, two general approaches towards the design of non- 
immunosuppressant, neuroimmunophilin ligands can be seen. The first involves the 
construction of constrained cyclic analogs of FK-506 in which the FKBP binding domain 
is fixed in a conformation optimal for binding to FKBP. The advantages of this approach 
are that the conformation of the analogs can be accurately modeled and predicted by 
computational methods, and the analogs closely resemble parent molecules that have 
proven pharmacological properties. A disadvantage is that the difficult chemistry limits 
the numbers and types of compounds that can be prepared. The second approach involves 
the trial and error construction of acyclic analogs of the FKBP binding domain by 
conventional medicinal chemistry. The advantage of this approach is that the chemistry 
is suitable for production of the numerous compounds needed for such iterative 
chemistry-bioassay approaches. The disadvantages are that the molecular types of 
compounds that have emerged have no known history of appropriate pharmacological 
properties, have rather labile ester functional groups, and are too conformationally mobile 
to allow accurate prediction of conformational properties. 

The present invention provides useful methods and reagents related to the first 
approach, but with significant advantages. The invention provides recombinant PKS 
genes that produce a wide variety of polyketides that cannot otherwise be readily 
synthesized by chemical methodology alone. Moreover, the present invention provides 
polyketides that have either or both of the desired immunosuppressive and neurotrophic 
activities, some of which are produced only by fermentation and others of which are 
produced by fermentation and chemical modification. Thus, in one aspect, the invention 
provides compounds that optimally bind to FKBP but do not bind to the effector proteins. 
The methods and reagents of the invention can be used to prepare numerous constrained 
cyclic analogs of FK-520 in which the FKBP binding domain is fixed in a conformation 
optimal for binding to FKBP. Such compounds will show neuroimmunophilin binding 
(neurotrophic) but not immunosuppressive effects. The invention also allows direct 
manipulation of FK-520 and related chemical structures via genetic engineering of the 
enzymes involved in the biosynthesis of FK-520 (as well as related compounds, such as 
FK-506 and rapamycin); similar chemical modifications are simply not possible because 
of the complexity of the structures. The invention can also be used to introduce "chemical 
handles" into normally inert positions that permit subsequent chemical modifications. 

Several general approaches to achieve the development of novel 
neuroimmunophilin ligands are facilitated by the methods and reagents of the present 
invention. One approach is to make "point mutations" of the functional groups of the 
parent FK-520 structure that bind to the effector molecules to eliminate their binding 
potential. These types of structural modifications are difficult to perform by chemical 
modification, but can be readily accomplished with the methods and reagents of the 
invention. 

A second, more extensive approach facilitated by the present invention is to 
utilize molecular modeling to predict optimal structures ab initio that bind to FKBP but 
not effector molecules. Using the available X-ray crystal structure of FK-520 (or FK-506) 
bound to FKBP, molecular modeling can be used to predict polyketides that should 
optimally bind to FKBP but not calcineurin. Various macrolide structures can be 
generated by linking the ends of the FKBP-binding domain with "all possible" polyketide 
chains of variable length and substitution patterns that can be prepared by genetic 
manipulation of the FK-520 or FK-506 PKS gene cluster in accordance with the methods 
of the invention. The ground state conformations of the virtual library can be determined, 
and compounds that possess binding domains most likely to bind well to FKBP can be 
prepared and tested. 
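
The scale of such a virtual library can be illustrated with a minimal sketch. It assumes, purely for illustration, that each extender module contributes one of a small set of extender units and leaves the beta-carbon in one of four oxidation states; the unit names, the set sizes, and the function names are hypothetical and are not taken from this disclosure.

```python
from itertools import product

# Hypothetical building blocks; the real substrate range of any engineered
# PKS is an assumption here, not data from the specification.
EXTENDER_UNITS = ("malonyl", "methylmalonyl", "ethylmalonyl")
BETA_CARBON_STATES = ("keto", "hydroxyl", "alkene", "methylene")


def module_variants():
    """All (extender unit, beta-carbon state) pairs for a single module."""
    return list(product(EXTENDER_UNITS, BETA_CARBON_STATES))


def enumerate_chains(n_modules):
    """Iterate over every chain of n_modules modules as a tuple of variants."""
    return product(module_variants(), repeat=n_modules)


def library_size(min_modules, max_modules):
    """Count chains whose length lies in [min_modules, max_modules]."""
    per_module = len(module_variants())
    return sum(per_module ** n for n in range(min_modules, max_modules + 1))


if __name__ == "__main__":
    # Show a few two-module chains, then the size of a small library.
    for chain in list(enumerate_chains(2))[:3]:
        print(chain)
    print(library_size(1, 3))
```

Because the count grows exponentially with chain length, a sketch like this is only useful for sizing the library; in practice the candidates would be filtered by modeling before any are prepared.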

Once a compound is identified in accordance with the above approaches, the 
invention can be used to generate a focused library of analogs around the lead candidate, 
to "fine tune" the compound for optimal properties. Finally, the genetic engineering 
methods of the invention can be directed towards producing "chemical handles" that 
enable medicinal chemists to modify positions of the molecule previously inert to 
chemical modification. This opens the path to previously prohibited chemical 
optimization of lead compounds by time-proven approaches. 

Moreover, the present invention provides polyketide compounds and the 
recombinant genes for the PKS enzymes that produce the compounds that have 
significant advantages over FK-506 and FK-520 and their analogs. The metabolism and 
pharmacokinetics of tacrolimus have been extensively studied, and FK-520 is believed to 
be similar in these respects. Absorption of tacrolimus is rapid, variable, and incomplete 
from the gastrointestinal tract (Harrison's Principles of Internal Medicine, 14th edition, 
1998, McGraw Hill, 14, 20, 21, 64-67). The mean bioavailability of the oral dosage form 
is 27% (range 5 to 65%). The volume of distribution (VolD) based on plasma is 5 to 65 
L per kg of body weight (L/kg), and is much higher than the VolD based on whole blood 
concentrations, the difference reflecting the binding of tacrolimus to red blood cells. 
Whole blood concentrations may be 12 to 67 times the plasma concentrations. Protein 
binding is high (75 to 99%), primarily to albumin and alpha1-acid glycoprotein. The 
half-life for distribution is 0.9 hour; elimination is biphasic and variable, with a terminal 
half-life of 11.3 hours (range, 3.5 to 40.5 hours). The time to peak concentration is 0.5 to 
4 hours after oral administration. 

Tacrolimus is metabolized primarily by cytochrome P450 3A enzymes in the liver 
and small intestine. The drug is extensively metabolized with less than 1% excreted 
unchanged in urine. Because hepatic dysfunction decreases clearance of tacrolimus, doses 
have to be reduced substantially in primary graft non-function, especially in children. In 
addition, drugs that induce the cytochrome P450 3A enzymes reduce tacrolimus levels, 
while drugs that inhibit these P450s increase tacrolimus levels. Tacrolimus bioavailability 
doubles with co-administration of ketoconazole, a drug that inhibits P450 3A. See, 
Vincent et al., 1992, In vitro metabolism of FK-506 in rat, rabbit, and human liver 
microsomes: Identification of a major metabolite and of cytochrome P450 3A as the 
major enzymes responsible for its metabolism, Arch. Biochem. Biophys. 294: 454-460; 
Iwasaki et al., 1993, Isolation, identification, and biological activities of oxidative 
metabolites of FK-506, a potent immunosuppressive macrolide lactone, Drug Metabolism 
& Disposition 21: 971-977; Shiraga et al., 1994, Metabolism of FK-506, a potent 
immunosuppressive agent, by cytochrome P450 3A enzymes in rat, dog, and human liver 
microsomes, Biochem. Pharmacol. 47: 727-735; and Iwasaki et al., 1995, Further 
metabolism of FK-506 (Tacrolimus); Identification and biological activities of the 
metabolites oxidized at multiple sites of FK-506, Drug Metabolism & Disposition 23: 
28-34. The cytochrome P450 3A subfamily of isozymes has been implicated as important in 
this degradative process. 

Structures of the eight isolated metabolites formed by liver microsomes are shown 
in Figure 6. Four metabolites of FK-506 involve demethylation of the oxygens on 
carbons 13, 15, and 31, and hydroxylation of carbon 12. The 13-demethylated (hydroxy) 
compounds undergo cyclizations of the 13-hydroxy at C-10 to give M-I, M-VI, and M-VII, 
and the 12-hydroxy metabolite at C-10 to give I. Another four metabolites formed by 
oxidation of the four metabolites mentioned above were isolated from liver microsomes 
of dexamethasone-treated rats. Three of these are metabolites doubly demethylated at 
the methoxy groups on carbons 15 and 31 (M-V), 13 and 31 (M-VI), and 13 and 15 
(M-VII). The fourth, M-VIII, was the metabolite produced after demethylation of the 
31-methoxy group, followed by formation of a fused ring system by further oxidation. 
Among the eight metabolites, M-II has immunosuppressive activity comparable to that of 
FK-506, whereas the other metabolites exhibit weak or negligible activities. Importantly, 
the major metabolite of human, dog, and rat liver microsomes is the 13-demethylated and 
cyclized FK-506 (M-I). 
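
The site assignments stated above can be collected in a small lookup table. The following is a minimal sketch, not part of the disclosed methods: it records only the metabolites whose demethylation sites are spelled out in the text and leaves out those whose sites are not individually assigned. The dictionary and function names are hypothetical.

```python
# Carbons bearing the methoxy groups that are demethylated in each
# metabolite, as stated in the text. Metabolites whose sites are not
# individually assigned in the text are deliberately omitted.
DEMETHYLATION_SITES = {
    "M-I": {13},
    "M-V": {15, 31},
    "M-VI": {13, 31},
    "M-VII": {13, 15},
    "M-VIII": {31},
}


def metabolites_involving(carbon):
    """Return the listed metabolites formed by demethylation at `carbon`."""
    return sorted(
        name for name, sites in DEMETHYLATION_SITES.items() if carbon in sites
    )


if __name__ == "__main__":
    for carbon in (13, 15, 31):
        print(carbon, metabolites_involving(carbon))
```

Querying carbon 13 returns the major metabolite M-I together with M-VI and M-VII, which makes it easy to see how much of the listed metabolism passes through the C-13 methoxy group.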

Thus, the major metabolism of FK-506 proceeds via 13-demethylation followed 
by cyclization to the inactive M-I, this representing about 90% of the metabolic products 
after a 10 minute incubation with liver microsomes. Analogs of tacrolimus that do not 
possess a C-13 methoxy group would not be susceptible to the first and most important 
biotransformation in the destructive metabolism of tacrolimus (i.e. cyclization of 
13-hydroxy to C-10). Thus, a 13-desmethoxy analog of FK-506 should have a longer 
half-life in the body than does FK-506. The C-13 methoxy group is believed not to be 
required for binding to FKBP or calcineurin. The C-13 methoxy is not present at the 
corresponding position of rapamycin, which binds to FKBP with an affinity equal to that 
of tacrolimus. Also, analysis of the 3-dimensional structure of the 
FKBP-tacrolimus-calcineurin complex shows that the C-13 methoxy has no interaction with FKBP and only 
a minor interaction with calcineurin. The present invention provides C-13-desmethoxy 
analogs of FK-506 and FK-520, as well as the recombinant genes that encode the PKS 
enzymes that catalyze their synthesis and host cells that produce the compounds. 
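
The practical meaning of a longer half-life can be sketched numerically. The example below assumes simple first-order (single-exponential) elimination, which is a simplification because the elimination of tacrolimus is described above as biphasic. The 11.3 hour value is the terminal half-life quoted in the text; the analog half-life is a purely hypothetical number chosen for illustration, and the function names are not from the source.

```python
import math


def fraction_remaining(hours, half_life_hours):
    """Fraction of drug remaining after `hours` under first-order elimination."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (hours / half_life_hours)


def elimination_rate_constant(half_life_hours):
    """First-order rate constant k (per hour): k = ln(2) / half-life."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return math.log(2) / half_life_hours


if __name__ == "__main__":
    parent_half_life = 11.3                  # hours; terminal value quoted in the text
    analog_half_life = 2 * parent_half_life  # hypothetical, for illustration only
    for t in (12, 24):
        print(
            t,
            round(fraction_remaining(t, parent_half_life), 3),
            round(fraction_remaining(t, analog_half_life), 3),
        )
```

Under these assumptions a compound with a longer half-life retains a larger fraction of the dose at any given time, which is the basis for expecting a lower dosage or less frequent administration.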

These compounds exhibit, relative to their naturally occurring counterparts, 
prolonged immunosuppressive action in vivo, thereby allowing a lower dosage and/or 
reduced frequency of administration. Dosing is more predictable, because the variability 
in FK-506 dosage is largely due to variation of metabolism rate. FK-506 levels in blood 
can vary widely depending on interactions with drugs that induce or inhibit cytochrome 
P450 3A (summarized in USP Drug Information for the Health Care Professional). Of 
particular importance are the numerous drugs that inhibit or compete for CYP 3A, 
because they increase FK-506 blood levels and lead to toxicity (Prograf package insert, 
Fujisawa US, Rev 4/97, Rec 6/97). Also important are the drugs that induce P450 3A 
(e.g. dexamethasone), because they decrease FK-506 blood levels and reduce efficacy. 
Because the major site of CYP 3A action on FK-506 is removed in the analogs provided 
by the present invention, those analogs are not as susceptible to drug interactions as the 
naturally occurring compounds. 

Hyperglycemia, nephrotoxicity, and neurotoxicity are the most significant adverse 
effects resulting from the use of FK-506 and are believed to be similar for FK-520. 
Because these effects appear to occur primarily by the same mechanism as the 
immunosuppressive action (i.e. FKBP-calcineurin interaction), the intrinsic toxicity of the 
desmethoxy analogs may be similar to FK-506. However, toxicity of FK-506 is dose 
related and correlates with high blood levels of the drug (Prograf package insert, 
Fujisawa US, Rev 4/97, Rec 6/97). Because the levels of the compounds provided by 
the present invention should be more controllable, the incidence of toxicity should be 
significantly decreased with the 13-desmethoxy analogs. Some reports show that certain 
FK-506 metabolites are more toxic than FK-506 itself, and this provides an additional 
reason to expect that a CYP 3A resistant analog can have lower toxicity and a higher 
therapeutic index. 

Thus, the present invention provides novel compounds related in structure to FK-506 
and FK-520 but with improved properties. The invention also provides methods for 
making these compounds by fermentation of recombinant host cells, as well as the 
recombinant host cells, the recombinant vectors in those host cells, and the recombinant 
proteins encoded by those vectors. The present invention also provides other valuable 
materials useful in the construction of these recombinant vectors that have many other 
important applications as well. In particular, the present invention provides the FK-520 
PKS genes, as well as certain genes involved in the biosynthesis of FK-520 in 
recombinant form. 

FK-520 is produced at relatively low levels in the naturally occurring cells, 
Streptomyces hygroscopicus var. ascomyceticus, in which it was first identified. Thus, 
another benefit provided by the recombinant FK-520 PKS and related genes of the 
present invention is the ability to produce FK-520 in greater quantities in the recombinant 
host cells provided by the invention. The invention also provides methods for making 
novel FK-520 analogs, in addition to the desmethoxy analogs described above, and 
derivatives in recombinant host cells of any origin. 



The biosynthesis of FK-520 involves the action of several enzymes. The FK-520 
PKS enzyme, which is composed of the fkbA, fkbB, fkbC, and fkbP gene products, 
synthesizes the core structure of the molecule. There is also a hydroxylation at C-9 
mediated by the P450 hydroxylase that is the fkbD gene product and that is oxidized by 
the fkbO gene product to result in the formation of a keto group at C-9. There is also a 
methylation at C-31 that is mediated by an O-methyltransferase that is the fkbM gene 
product. There are also methylations at the C-13 and C-15 positions by a 
methyltransferase believed to be encoded by the fkbG gene; this methyltransferase may 
act on the hydroxymalonyl CoA substrates prior to binding of the substrate to the AT 
domains of the PKS during polyketide synthesis. The present invention provides the 
genes encoding these enzymes in recombinant form. The invention also provides the 
genes encoding the enzymes involved in ethylmalonyl CoA and 2-hydroxymalonyl CoA 
biosynthesis in recombinant form. Moreover, the invention provides Streptomyces 
hygroscopicus var. ascomyceticus recombinant host cells lacking one or more of these 
genes that are useful in the production of useful compounds. 

The cells are useful in production in a variety of ways. First, certain cells make a 
useful FK-520-related compound merely as a result of inactivation of one or more of the 
FK-520 biosynthesis genes. Thus, by inactivating the C-31 O-methyltransferase gene in 
Streptomyces hygroscopicus var. ascomyceticus, one creates a host cell that makes a 
desmethyl (at C-31) derivative of FK-520. Second, other cells of the invention are unable 
to make FK-520 or FK-520 related compounds due to an inactivation of one or more of 
the PKS genes. These cells are useful in the production of other polyketides produced by 
PKS enzymes that are encoded on recombinant expression vectors and introduced into 
the host cell. 

Moreover, if only one PKS gene is inactivated, the ability to produce FK-520 or 
an FK-520 derivative compound is restored by introduction of a recombinant expression 
vector that contains the functional gene in a modified or unmodified form. The 
introduced gene produces a gene product that, together with the other endogenous and 
functional gene products, produces the desired compound. This methodology enables one 
to produce FK-520 derivative compounds without requiring that all of the genes for the 
PKS enzyme be present on one or more expression vectors. Additional applications and 
benefits of such cells and methodology will be readily apparent to those of skill in the art 
after consideration of how the recombinant genes were isolated and employed in the 
construction of the compounds of the invention. 

The FK-520 biosynthetic genes were isolated by the following procedure. 
Genomic DNA was isolated from Streptomyces hygroscopicus var. ascomyceticus 
(ATCC 14891) using the lysozyme/proteinase K protocol described in Genetic 
Manipulation of Streptomyces - A Laboratory Manual (Hopwood et al., 1986). The 
average size of the DNA was estimated to be between 80 and 120 kb by electrophoresis on 
0.3% agarose gels. A library was constructed in the SuperCos™ vector according to the 
manufacturer's instructions and with the reagents provided in the commercially available 
kit (Stratagene). Briefly, 100 µg of genomic DNA was partially digested with 4 units of 
Sau3AI for 20 min. in a reaction volume of 1 mL, and the fragments were 
dephosphorylated and ligated to SuperCos vector arms. The ligated DNA was packaged 
and used to infect log-stage XL1-Blue MR cells. A library of about 10,000 independent 
cosmid clones was obtained. 
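
Whether a library of a given size is likely to contain any particular gene can be estimated with the standard Clarke-Carbon formula, N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to genome size. The sketch below is illustrative only: the genome size and insert size are assumptions (the text states neither), and the function name is hypothetical.

```python
import math


def clones_needed(genome_bp, insert_bp, probability):
    """Clarke-Carbon estimate of the clones needed so that a given sequence
    is present with the requested probability."""
    if not 0 < probability < 1:
        raise ValueError("probability must be between 0 and 1")
    fraction = insert_bp / genome_bp
    if not 0 < fraction < 1:
        raise ValueError("insert must be smaller than the genome")
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


if __name__ == "__main__":
    # Both sizes are assumptions for illustration, not values from the text.
    genome_bp = 8_000_000  # assumed Streptomyces-scale genome
    insert_bp = 40_000     # assumed typical cosmid insert
    for p in (0.95, 0.99, 0.999):
        print(p, clones_needed(genome_bp, insert_bp, p))
```

Under these assumed sizes, the number of clones required for near-complete coverage is far smaller than the library described above, which is consistent with recovering several overlapping cosmids for the cluster.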

Based on recently published sequence from the FK-506 cluster (Motamedi and 
Shafiee, 1998, Eur. J. Biochem. 256: 528), a probe for the fkbO gene was isolated from 
ATCC 14891 using PCR with degenerate primers. With this probe, a cosmid designated 
pKOS034-124 was isolated from the library. With probes made from the ends of cosmid 
pKOS034-124, an additional cosmid designated pKOS034-120 was isolated. These 
cosmids (pKOS034-124 and pKOS034-120) were shown to contain DNA inserts that 
overlap with one another. Initial sequence data from these two cosmids generated 
sequences similar to sequences from the FK-506 and rapamycin clusters, indicating that 
the inserts were from the FK-520 PKS gene cluster. Two EcoRI fragments were 
subcloned from cosmids pKOS034-124 and pKOS034-120. These subclones were used 
to prepare shotgun libraries by partial digestion with Sau3AI, gel purification of 
fragments between 1.5 kb and 3 kb in size, and ligation into the pLitmus28 vector (New 
England Biolabs). These libraries were sequenced using dye terminators on a Beckman 
CEQ2000 capillary electrophoresis sequencer, according to the manufacturer's protocols. 

To obtain cosmids containing sequence on the left and right sides of the 
sequenced region described above, a new cosmid library of ATCC 14891 DNA was 
prepared essentially as described above. This new library was screened with a new fkbM 
probe isolated using DNA from ATCC 14891. A probe representing the fkbP gene at the 
end of cosmid pKOS034-124 was also used. Several additional cosmids to the right of the 
previously sequenced region were identified. Cosmids pKOS065-C31 and pKOS065-C3 
were identified and then mapped with restriction enzymes. Initial sequences from these 
cosmids were consistent with the expected organization of the cluster in this region. More 
extensive sequencing showed that both cosmids contained, in addition to the desired 
sequences, other sequences not contiguous to the desired sequences on the host cell 
chromosomal DNA. Probing of additional cosmid libraries identified two additional 
cosmids, pKOS065-M27 and pKOS065-M21, that contained the desired sequences in a 
contiguous segment of chromosomal DNA. Cosmids pKOS034-124, pKOS034-120, 
pKOS065-M27, and pKOS065-M21 have been deposited with the American Type 
Culture Collection, Manassas, VA, USA. The complete nucleotide sequence of the 
coding sequences of the genes that encode the proteins of the FK-520 PKS is shown 
below but can also be determined from the cosmids of the invention deposited with the 
ATCC using standard methodology. 

Referring to Figures 1 and 3, the FK-520 PKS gene cluster is composed of four 
open reading frames designated fkbB, fkbC, fkbA, and fkbP. The fkbB open reading frame 
encodes the loading module and the first four extender modules of the PKS. The fkbC 
open reading frame encodes extender modules five and six of the PKS. The fkbA open 
reading frame encodes extender modules seven, eight, nine, and ten of the PKS. The fkbP 
open reading frame encodes the NRPS of the PKS. Each of these genes can be isolated 
from the cosmids of the invention described above. The DNA sequences of these genes 
are provided below preceded by the following table identifying the start and stop codons 
of the open reading frames of each gene and the modules and domains contained therein. 

Nucleotides                      Gene or Domain 

complement (412 - 1836)          fkbW 
complement (2020 - 3579)         fkbV 
complement (3969 - 4496)         fkbR2 
complement (4595 - 5488)         fkbR1 
5601 - 6818                      fkbE 
6808 - 8052                      fkbF 
8156 - 8824                      fkbG 
complement (9122 - 9883)         fkbH 
complement (9894 - 10994)        fkbI 
complement (10987 - 11247)       fkbJ 
complement (11244 - 12092)       fkbK 
complement (12113 - 13150)       fkbL 
complement (13212 - 23988)       fkbC 
complement (23992 - 46573)       fkbB 
46754 - 47788                    fkbO 
47785 - 52272                    fkbP 
52275 - 71465                    fkbA 
71462 - 72628                    fkbD 
72625 - 73407                    fkbM 
complement (73460 - 76202)       fkbN 
complement (76336 - 77080)       fkbQ 
complement (77076 - 77535)       fkbS 
complement (44974 - 46573)       CoA ligase of loading domain 
complement (43777 - 44629)       ER of loading domain 
complement (43144 - 43660)       ACP of loading domain 
complement (41842 - 43093)       KS of extender module 1 (KS1) 
complement (40609 - 41842)       AT1 
complement (39442 - 40609)       DH1 
complement (38677 - 39307)       KR1 
complement (38371 - 38581)       ACP1 
complement (37145 - 38296)       KS2 
complement (35749 - 37144)       AT2 
complement (34606 - 35749)       DH2 (inactive) 
complement (33823 - 34480)       KR2 
complement (33505 - 33715)       ACP2 
complement (32185 - 33439)       KS3 
complement (31018 - 32185)       AT3 
complement (29869 - 31018)       DH3 (inactive) 
complement (29092 - 29740)       KR3 
complement (28750 - 28960)       ACP3 
complement (27430 - 28684)       KS4 
complement (26146 - 27430)       AT4 
complement (24997 - 26146)       DH4 (inactive) 
complement (24163 - 24373)       ACP4 
complement (22653 - 23892)       KS5 
complement (21420 - 22653)       AT5 
complement (20241 - 21420)       DH5 
complement (19464 - 20097)       KR5 
complement (19116 - 19326)       ACP5 
complement (17820 - 19053)       KS6 
complement (16587 - 17820)       AT6 
complement (15438 - 16587)       DH6 
complement (14517 - 15294)       ER6 
complement (13761 - 14394)       KR6 
complement (13452 - 13662)       ACP6 
52362 - 53576                    KS7 
53577 - 54716                    AT7 
54717 - 55871                    DH7 
56019 - 56819                    ER7 
56943 - 57575                    KR7 
57710 - 57920                    ACP7 
57990 - 59243                    KS8 
59244 - 60398                    AT8 
60399 - 61412                    DH8 (inactive) 
61548 - 62180                    KR8 
62328 - 62537                    ACP8 
62598 - 63854                    KS9 
63855 - 65084                    AT9 
65085 - 66254                    DH9 
66399 - 67175                    ER9 
67299 - 67931                    KR9 
68094 - 68303                    ACP9 
68397 - 69653                    KS10 
69654 - 70985                    AT10 
71064 - 71273                    ACP10 
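
The domain assignments in the table above can be read as a prediction of how far each extender module reduces the beta-carbon of the growing chain. The following is a minimal sketch using the conventional rule of thumb for modular PKSs (no active KR leaves a keto group; KR alone gives a hydroxyl; KR plus DH gives an alkene; KR, DH, and ER give a methylene). The rule is a general textbook heuristic applied here as an assumption, not a statement from this disclosure, and it ignores post-PKS tailoring. The dictionary and function names are hypothetical.

```python
# Active reductive domains listed in the table for each extender module;
# domains marked "(inactive)" in the table are left out.
ACTIVE_REDUCTIVE_DOMAINS = {
    1: {"KR", "DH"},
    2: {"KR"},
    3: {"KR"},
    4: set(),
    5: {"KR", "DH"},
    6: {"KR", "DH", "ER"},
    7: {"KR", "DH", "ER"},
    8: {"KR"},
    9: {"KR", "DH", "ER"},
    10: set(),
}

# Open reading frame encoding each extender module, as described in the text.
MODULE_TO_ORF = {m: "fkbB" for m in range(1, 5)}
MODULE_TO_ORF.update({m: "fkbC" for m in (5, 6)})
MODULE_TO_ORF.update({m: "fkbA" for m in range(7, 11)})


def beta_carbon_state(domains):
    """Rule-of-thumb oxidation state left at the beta-carbon by one module."""
    if "KR" not in domains:
        return "keto"
    if "DH" not in domains:
        return "hydroxyl"
    if "ER" not in domains:
        return "alkene"
    return "methylene"


if __name__ == "__main__":
    for module in sorted(ACTIVE_REDUCTIVE_DOMAINS):
        state = beta_carbon_state(ACTIVE_REDUCTIVE_DOMAINS[module])
        print(f"module {module:2d} ({MODULE_TO_ORF[module]}): {state}")
```

A table of this kind is a convenient starting point when deciding which domain to inactivate or replace in order to change the oxidation state at a chosen position of the polyketide.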



1 GATCTCAGGC ATGAAGTCCT CCAGGCGAGG CGCCGAGGTG GTGAACACCT CGCCGCTGCT 
61 TGTACGGACC ACTTCAGTCA GCGGCGATTG CGGAACCAAG TCATCCGGAA TAAAGGGCGG 
121 TTACAAGATC CTCACATTGC GCGACCGCCA GCATACGCTG AGTTGCCTCA GAGGCAAACC 
181 GAAAGGGCGC GGGCGGTCCG CACCAGGGCG GAGTACGCGA CGAGAGTGGC GCACCCGCGC 
241 ACCGTCACCT CTCTCCCCCG CCGGCGGGAT GCCCGGCGTG ACACGGTTGG GCTCTCCTCG 
301 ACGCTGAACA CCCGCGCGGT GTGGCGTCGG GGACACCGCC TGGCATCGGC CGGGTGACGG 
361 TACGGGGAGG GCGTACGGCG GCCGTGGCTC GTGCTCACGG CCGCCGGGCG GTCATCCGTC 
421 GAGACGGCAC TCGGCGAGCA GGGACGCCTG GTCGGCACCT GCGGGCCGGA CGACCGTGTG 
481 GTTCGCGGGC GGGCGGTGGC CGGTGGTGAG CCAGCTCTCC AGGGCGGTGA AGGCTGAGCG 
541 GTGACACGGC AGCAAAGGCC GGAGTCGGTC GGGGAAGGTG TCGACGAGGG CGTCGGTGTG 
601 CGTGCCGTCC TCGATGCGGT AGTAGCGGTA CCGGCCGCCA GGCCGCTGCC GGACATACGC 
661 GCGTACACGT CGGAGCCCGG GCGGCAGGCA GCAGCACGTC GAGAGTGCCT GGATGGTGAT 
721 CAGCGGCTTG CCGATACGAC CGGTCAACGC GATGCGTTCC ACGGCCGCGT GGACGCCGGA 
781 GGAGCGGGTG GCGTAGTCGT AGTCGGCATC GCAGCCCGGG ACCGTCCCCG GGGCGCAATA 
841 CGGTGTGCCG GCTTCCTTCT CCCCATCGAA GCCGGGGTCG AACTCCTCGC GGTAGACGCG 
901 CTGCGTCAGA TCCCAGTAGA CCTCGTGGTG GTACGGCCAC AAGAACTCGG AGTCGGCCGG 
961 GAACCCGGCG CGGAGCAGCG CCTCGCGCGC CTGGCCGGCT GCGGGGCCGC CTGCCGCGTA 
1021 GGTGGGGTAG TCGCGCAGGG CGGCCGGCAG GAAGGTGAAG AGGTTGGGAC CCTCCGCGCG 
1081 CCACAGGGTG CCTTCCCAGT CGACTCCTCC GTCGTACAGC TCGGGATGGT TCTCCAGCTG 
1141 CCAGCGCACG AGGTAGCCGC CGTTGGACAT CCCGGTGACC AGGGTGCGCT CGAGCGGCCG 
1201 GTGGTAGCGC TGGGCGACCG ACGCGCGGGC GGCCCGGGTC AGCTGGGTGA GGCGGGTGTT 
1261 CCACTCGGCG ACGGCGTCGC CCGGCCGGGA GCCATCACGG TAGAACGCGG GGCCGGTGTT 
1321 GCCCTTGTCG GTGGCGGCGT AGGCGTAACC GCGGGCGAGC ACCCAGTCGG CGATGGCCCG 
1381 GTCGTTGGCG TACTGCTCGC GGTTACCGGG GGTGCCGGCC ACGACCAGGC CACCGTTCCA 
1441 GCGGTCGGGC AGCCGGATGA CGAACTGGGC GTCGTGGTTC CACCCGTGGT TGGTGTTGGT 
1501 GGTGGAGGTG TCGGGGAAGT AGCCGTCGAT CTGGATCCCG GGCACTCCGG TGGGAGTGGC 
1561 CAGGTTCTTG GGCGTCAGCC CTGCCCAGTC CGCCGGGTCG GTGTGGCCGG TGGCCGCCGT 
1621 TCCCGCCGTG GTCAGCTCGT CCAGGCAGTC GGCCTGCTGA CGTGCCGCCG CCGGGACACG 
1681 CAGCTGGGAC AGACGGGCGC AGTGACCGTC CGGGGCATCG GGAGCAGGCC GGGCCGTGGC 
1741 CGGTGAGGGG AGCAGGACGG CGACTGCGGC CAGGGTGAGA GCGCCGAGGC CGGTGCGTCT 
1801 TCTCGGGGCC CGTCCGACAC CGAGGGGCAG AACCATGGAG AGCCTCCAGA CGTGCGGATG 
1861 GATGACGGAC TGGAGGCTAG GTCGCGCACG GTGGAGACGA ACATGGGTGC GCCCGCCATG 
1921 ACTGAGGCCC CTCAGAGGTG GGCCGCCGCC ATGACGGGCG CGGGACCGCG GGCGCTCCGG 
1981 GGCGGTGCCC GCGGCCGCCA CCGGTTCCGG GTCCCCGGGT CAGGGACAGG TGTCGTTCGC 
2041 GACGGTGAAG TAGCCGGTCG GCGACTCTTT CAAGGTGGTC GTGACGAAGG TGTTGTACAG 
2101 GCCCATGTTC TGGCCGGAGC CCTTGGCGTA GGTGTAACCG GCGCTCGTCG TGGCGCGGCC 
2161 CGCCTGGACG TGAGCGTAGT TGCCGGCGGT CCAGCAGACG GCCGTGGCAC CGGTCGTCTG 
2221 CGCGGTGACC GCGCCCGAGA GCGGTCCGGC CTTGCCGTCC GCGTCCCGGG CGGCGACCGC 
2281 GTAGGTGTGC GATGTGCCCG CCCTCAGGCC GGTGTCCGTG TACGACGTCG TGGCGGACGT 
2341 GGTGATCTGG GCACCGTCGC GGTGGACGGC GTAGTCGGTG GCGCCGTCGA CGGGTTTCCA 
2401 GGTCAGGCTG ATGGTGGTGT CGGTGGCGCC GGTGGCGGCC AGGCCGGACG GAGCGGGCAG 
2461 CGAACCGGGG TCGGAGGCGG ATCCGCTCAG GCCGAAGAAC TGCGTGATCC AGTAGCTGGA 
2521 ACAGATCGAG TCCAGGAAGT AGGCGGCGCC GGTGCTGCCG CACTGCTGTG CTCCGGTGCC 
2581 GGGATCGACC GGGGTGCCGT GCCCGATGCC CGGCACCCGG TTCACCTCCA CGGCCACCGA 
2641 TCCGTCCGCG GCCAGGTACT CCTCGTGCCG GGTGGAGTTC GGGCCGATCA CCGAGGTACG 
2701 GTCCGGCGTC TGGGACACGC CGTGCACAGC GGTCCACTGG TCGCGCAACT CGTCGGCGTT 
2761 GCGCGGCGCG ACGGTGGTGT CCTTGTCGCC GTGCCAGATG GCCACGCGCG GCCACGGGCC 
2821 CGACCACGAG GGGTAGCCGT CACGGACCCG CCGCGCCCAC TGGTCCGCGG TCAGGTCGGT 
2881 CCCGGGGTTC ATGCACAGGT ACGCGCTGCT GACGTCGGTG GCACAGCCGA AGGGCAGGCC 
2941 GGCGACGACC GCGCCGGCCT GGAAGACGTC CGGATAGGTG GCGAGCATCA CCGACGTCAT 
3001 GGCACCGCCG GCGGACAGCC CGGTGATGTA GGTGCGCTGG GGGTCCGCGC CGTAGGCGGA 
3061 GACGGTGTGA GCGGCCATCT GCCGGATCGA CGCGGCTTCG CCCTGGCCCC TGCGGTTGTC 
3121 GCTGCTCTGG AACCAGTTGA AGCACCTGTT CGCGTTGTTC GACGACGTGG TCTCGGCGAA 
3181 CACGAGCAGG AAGCCATAGC GGTCCGCGAA TGAGAGCAGG CCGGAGTTGT CGGCGTAGCC 
3241 CTGGGCGTCC TGGGTGCAAC CGTGCAGGGC GAACACCACC GCCGGCTCCG CGGGCAGGGA 
3301 CGCGGGCCGG TAGACGTACA TGTTCAGCCG GCCCGGGTTC GTGCCGAAGT CCGCGACCTC 
3361 GGTCAGGTCC GCCTTGGTCA GACCGGGCTT GGCCAGGCCC GCCGCGGCGT GGGCCGTCGG 
3421 CGCCGGGCCG AGCAGGGCCG CTCCGAGTAC GAGGGCCACG ACGGCCACGA GACGGGTGAG 
3481 CACCCCCCGC CGTCCCGGAC GCGACAACGA CCCGACCGGC GGCGAGGAGG AGAGGGGGAA 
3541 CAGCGGGGTG AGGATTCCCC GGAACGGCGG CGGCTGCATG GCGGCTCCCT CGATGTCGTG 
3601 GGGGGGACAC GGAGGGCTCC CTGACGTCGA TCAGTGGGAG CGCCCCGGTG CCCGGCACCG 
3661 TAGGGGTGGT TCAACCCGCA ACGGTATGGC CCGGAGCACC ACACCCCGCA CCGCGCGATG 
3721 TGCGCCCGGA CGGATTGTGT CGCCTTGCGG AATCTGATAC CCGGACGCGA CGAACGCCCC 
3781 ACCCGACACG GGTAGGGCGT CATGGTGTCC GACTCGGCCG GTCGGCCTTG CCTGCCCTGG 
3841 ACGGACCGGG CGTCGGCGGA CCGGGCGTCG GCGGGCTGGG CGGTATGGCG GCCGAGGACG 
3901 CCAGCCGCGT GGGGCGGCCG CGCCCAAGTG CAGTACGCCG ACCGTGGCCG GCGGGAGGGC 
3961 CGGACCGGTC AGTGCAGTCC CGCGGCCCTG CGGGACCGCT CGTCCCAGAC GGGTTCCACC 
4021 GCGGCGAACC GGGGTCCGTG TCCGCGGCGG TAGACCATCA GTGTCCGCTC GAAGGTGATG 
4081 ACGATGACAC CGTCCTGGTT GTAGCCGATG GTGCGCACGC TGATGATGCC TACGTCAGGT 
4141 CGGCTGGCGG ACTCCCGGGT GTTCAGGACC TCGGACTGCG AGTAGATGGT GTCGCCCTCG 
4201 AAGACCGGGT TCGGCAGCCT GACCCGGTCC CAGCCGAGGT TGGCCATCAC ATGCTGGGAG 
4261 ATGTCGGTGA CGCTCTGCCC GGTGACCAGG GCGAGGGTGA AGGTGGAGTC CACCAGCGGC 
4321 TTGCCCCAGG TGGTGCCCGC CGAGTAGTGG CGGTCGAAGT GCAGCGGCGC GGTGTTCTGC 
4381 GTCAGGAGCG TGAGCCAGGA GTTGTCGGTC TCCAGGACCG TGCGGCCCAG GGGGTGGCGG 
4 441 TACACGTCGC CGGTGGTGAA GTCCTCGAAG TAGCGGCCCT GCCAGCCCTC GACCACAGCG 
4501 GTGCGGGTGG CGTCCTGGTC CGGGTTCTCA GTCGTCATGG CGCTCATTCT GGGAAGTCCC
4561 CGGTCCGCTG TGAAATGCCG AACCTTCACC GGGCTCATAC GTGCGGCGCA TGAGCCCTGG
4621 ACCGTACGTA GTCGTAGAAC CTCGCCACCA CTGGCGCGCG TGGTCCTCCG GCGAGTGTGA
4681 CCACGCCGAC CGTGCGCCGC GCCTGCGGGT CGTCGAGCGG CACGGCGACG GCGTGGTCAC
4741 CGGGCCCGGA CGGGCTGCCG GTGAGGGGGG CGACGGCCAC ACCGAGGCCG GCGGCGACCA
4801 GGGCCCGCAG CGTGCTCAGC TCGGTGCTCT CCAGGACGAC CCGCGGCACG AATCCGGCCG
4861 CGGCGCACAG CCGGTCGGTG ATCTGGCGCA GTCCGAAGAC CGGCTCCAGT GCCACGAACG
4921 CCTCATCGGC CAGCTCCGCG GTCCGCACCC GGCGGCGTCT GGCCAGCCGG TGTCCGGGTG
4981 GGACGAGCAG GCACAGTGCC TCGTCCCGCA GTGGTGTCCA CTCCACATCG TCCCCGGCGG
5041 GTCGTGGGCT GGTCAGCCCC AGGTCCAGCC TGCTGTTGCG GACGTCGTCG ACCACGGCGT
5101 CGGCGGCGTC GCCGCGCAGT TCGAAGGTGG TGCCGGGAGC CAGCCGGCGG TACCCGGCGA
5161 GGAGGTCGGG CACCAGCCAG GTGCCGTAGG AGTGCAGGAA ACCCAGTGCC ACGGTGCCGG
5221 TGTCGGGGTC GATCAGGGCG GTGATGCGCT GCTCGGCGCC GGAGACCTCA CTGATCGCGC
5281 GCAGGGCGTG GGCGCGGAAG ACCTCGCCGT ACTTGTTGAG CCGGAGCCGG TTCTGGTGCC
5341 GGTCGAACAG CGGCACGCCC ACTCGTCGCT CCAGCCGCCG GATGGCCCTG GACAGGGTCG
5401 GCTGGGAGAT GTTGAGCCGT TCCGCGGTGA TCGTCACGTG CTCGTGCTCG GCCAAGGCCG
5461 TGAACCACTG CAACTCCCGT ATCTCCATGC AGGGACTATA CGTACCGGGC ATGGTCCTGG
5521 CGAGGTTTCG TCATTTCACA GCGGCCGGGC GGCGGCCCAC AGTGAGTCCT CACCAACCAG
5581 GACCCCATGG GAGGGACCCC ATGTCCGAGC CGCATCCTCG CCCTGAACAG GAACGCCCCG
5641 CCGGGCCCCT GTCCGGTCTG CTCGTGGTTT CTTTGGAGCA GGCCGTCGCC GCTCCGTTCG
5701 CCACCCGCCA CCTGGCGGAC CTGGGCGCCC GTGTCATCAA GATCGAACGC CCCGGCAGCG
5761 GCGACCTCGC CCGCGGCTAC GACCGCACGG TGCGTGGCAT GTCCAGCCAC TTCGTCTGGC
5821 TGAACCGGGG GAAGGAGAGC GTCCAGCTCG ATGTGCGCTC GCCGGAGGGC AACCGGCACC
5881 TGCACGCCTT GGTGGACCGG GCCGATGTCC TGGTGCAGAA TCTGGCACCC GGCGCCGCGG
5941 GCCGCCTGGC ATCGGCCACC AGGTCCTCGC GCGGAGCCAC CGAGGCTGAT CACCTGCGGA
6001 CATATCCGGC TACGGCAGTA CCGGCTGCTA CCGCGGACCG CAAGGCGTAC GACCTCCTGG
6061 TCCAGTGCGA AGCGGGGCTG GTCTCCATCA CCGGCACCCC CGAGACCCCG TCCAAGGTGG
6121 GCCTGTCCAT CGCGGACATC TGTGCGGGGA TGTACGCGTA CTCCGGCATC CTCACGGCCC
6181 TGCTGAAGCG GGCCCGCACC GGCCGGGGCT CGCAGTTGGA GGTCTCGATG CTCGAAGCCC
6241 TCGGTGAATG GATGGGATAC GCCGAGTACT ACACGCGCTA CGGCGGCACC GCTCCGGCCC
6301 GCGCCGGCGC CAGCCACGCG ACGATCGCCC CCTACGGCCC GTTCACCACG CGCGACGGGC
6361 AGACGATCAA TCTCGGGCTC CAGAACGAGC GGGAGTGGGC TTCCTTCTGC GGTGTCGTGC
6421 TACAACGCCC CGGTCTCTGC GACGACCCGC GCTTTTCCGG CAACGCCGAC CGGGTGGCGC
6481 ACCGCACCGA GCTCGACGCC CTGGTGAGCG AGGTGACGGG CACGCTCACC GGCGAGGAAC
6541 TGGTGGCGCG GCTGGAGGAG GCGTCGATCG CCTACGCACG CCAGCGCACC GTGCGGGAGT
6601 TCAGCGAACA CCCCCAACTG CGTGACCGTG GACGCTGGGC TCCGTTCGAC AGCCCGGTCG
6661 GTGCGCTGGA GGGCCTGATC CCCCCGGTCA CCTTCCACGG CGAGCACCCG CGGCGGCTGG
6721 GCCGGGTCCC GGAGCTGGGC GAGCATACCG AGTCCGTCCT GGCGTGGCTG GCCGCGCCCC
6781 ACAGCGCCGA CCGCGAAGAG GCCGGCCATG CCGAATGAAC TCACCGGAGT CCTGATCCTG
6841 GCCGCCGTGT TCCTGCTCGC CGGCGTACGG GGGCTGAACA TGGGCCTGCT CGCGCTGGTC
6901 GCCACCTTTC TGCTCGGGGT GGTCGCACTC GACCGAACGC CGGACGAGGT GCTGGCGGGT
6961 TTCCCCGCGA GCATGTTCCT GGTGCTGGTC GCCGTCACGT TCCTCTTCGG GATCGCCCGC
7021 GTCAACGGCA CGGTGGACTG GCTGGTACGT GTCGCGGTGC GGGCGGTGGG GGCCCGGGTG
7081 GGAGCCGTCC CCTGGGTGCT CTTCGGCCTG GCGGCACTGC TCTGCGCGAC AGGCGCGGCC
7141 TCGCCCGCGG CGGTGGCGAT CGTGGCGCCG ATCAGCGTCG CGTTCGCCGT CAGGCACCGC
7201 ATCGATCCGC TGTACGCCGG ACTGATGGCG GTGAACGGGG CCGCAGCCGG CAGTTTCGCC
7261 CCCTCCGGGA TCCTGGGCGG CATCGTCCAC TCGGCGCTGG AGAAGAACCA TCTGCCCGTC
7321 AGCGGCGGGC TGCTCTTCGC AGGCACCTTC GCCTTCAACC TGGCGGTCGC CGCGGTGTCA
7381 TGGCTCGTCC TCGGGCGCAG GCGCCTCGAA CCACATGACC TGGACGAGGA CACCGATCCC
7441 ACGGAAGGGG ACCCGGCTTC CCGCCCCGGC GCGGAACACG TGATGACGCT GACCGCGATG
7501 GCCGCGCTGG TGCTGGGAAC CACGGTCCTC TCCCTGGACA CCGGCTTCCT GGCCCTCACC
7561 TTGGCGGCGT TGCTGGCGCT GCTCTTCCCG CGCACCTCCC AGCAGGCCAC CAAGGAGATC
7621 GCCTGGCCCG TGGTGCTGCT GGTATGCGGG ATCGTGACCT ACGTCGCCCT GCTCCAGGAG
7681 CTGGGCATCG TGGACTCCCT GGGGAAGATG ATCGCGGCGA TCGGCACCCC GCTGCTGGCC
7741 GCCCTGGTGA TCTGCTACGT GGGCGGTGTC GTCTCGGCCT TCGCCTCGAC CACCGGGATC
7801 CTCGGTGCCC TGATGCCGCT GTCCGAGCCG TTCCTGAAGT CCGGTGCCAT CGGGACGACC
7861 GGCATGGTGA TGGCCCTGGC GGCCGCGGCG ACCGTGGTGG ACGCGAGTCC CTTCTCCACC
7921 AATGGTGCTC TGGTGGTGGC CAACGCTCCC GAGCGGCTGC GGCCCGGCGT GTACCAGGGG

7981 TTGCTGTGGT GGGGCGCCGG GGTGTGCGCA CTGGCTCCCG CGGCCGCCTG GGCGGCCTTC
8041 GTGGTGGCGT GAGCGCAGCG GAGCGGGAAT CCCCTGGAGC CCGTTTCCCG TGCTGTGTCG 
8101 CTGACGTAGC GTCAAGTCCA CGTGCCGGGC GGGCAGTACG CCTAGCATGT CGGGCATGGC 
8161 TAATCAGATA ACCCTGTCCG ACACGCTGCT CGCTTACGTA CGGAAGGTGT CCCTGCGCGA 
8221 TGACGAGGTG CTGAGCCGGC TGCGCGCGCA GACGGCCGAG CTGCCGGGCG GTGGCGTACT 
8281 GCCGGTGCAG GCCGAGGAGG GACAGTTCCT CGAGTTCCTG GTGCGGTTGA CCGGCGCGCG 
8341 TCAGGTGCTG GAGATCGGGA CGTACACCGG CTACAGCACG CTCTGCCTGG CCCGCGGATT 

8401 GGCGCCCGGG GGCCGTGTGG TGACGTGCGA TGTCATGCCG AAGTGGCCCG AGGTGGGCGA
8461 GCGGTACTGG GAGGAGGCCG GGGTTGCCGA CCGGATCGAC GTCCGGATCG GCGACGCCCG
8521 GACCGTCCTC ACCGGGCTGC TCGACGAGGC GGGCGCGGGG CCGGAGTCGT TCGACATGGT 
8581 GTTCATCGAC GCCGACAAGG CCGGCTACCC CGCCTACTAC GAGGCGGCGC TGCCGCTGGT 
8641 ACGCCGCGGC GGGCTGATCG TCGTCGACAA CACGCTGTTC TTCGGCCGGG TGGCCGACGA 
8701 AGCGGTGCAG GACCCGGACA CGGTCGCGGT ACGCGAACTC AACGCGGCAC TGCGCGACGA 
8761 CGACCGGGTG GACCTGGCGA TGCTGACGAC GGCCGACGGC GTCACCCTGC TGCGGAAACG
8821 GTGACCGGGG CGATGTCGGC GGCGGTCAGC GTCAGCGTCG TCGGCGCGGG CCTCGCGGAG 
8881 GGCTCCAGAT GCAGGCGTTC GACGCCGGCG GCGGAAGCGC CCGCCACCTC GGACACGCAG 
8941 GGGCAGTCGG AGTCCGCGAA GCCCGCGAAC CGGTAGGCGA TCTCCATCAT GCGGTTGCGG
9001 TCCGTACGCC GGAAGTCCGC CACCAGGTGC GCCCCCGCGC GGGCGCCCTG GTCCGTGAGC 
9061 CAGTTCAGGA TCGTCGCACC GGCACCGAAC GACACGACCC GGCAGGACGT GGCGAGCAGT
9121 TTCAGGTGCC ACGTCGACGG CTTCTTCTCC AGCAGGATGA TGCCGACGGC GCCGTGCGGG 
9181 CCGAAGCGGT CGCCCATGGT GACGACGAGG ACCTCATGGG CGGGATCGGT GAGCACGCGC 
9241 GCAGGTCGGC GTCGGAGTAG TGCACGCCGG TCGCGTTCAT CTGGCTGGTC CGCAGCGTCA 
9301 GTTCCTCGAC GCGGCTGAGT TCCTCCTCCC CCGCGGGTGC GATCGTCATG GAGAGGTCGA 
9361 GCGAGCGCAG GAAGTCCTCG TCGGGACCGG AGTACGCCTC CCGGGCCTGG TCGCGCGCGA 
9421 AACCCGCCTG GTACATCAGG CGGCGCCGAC GCGAGTCGAC CGTGGACACC GGCGGGCTGA
9481 ACTCCGGCAG CGACAGGAGC GTGGCCGCCT GCTCGGCCGG GTAGCACCGC ACCTCGGGCA
9541 GGTGGAACGC CACCTCGGCA CGCTCGGCGG GCTGGTCGTC GATGAACGCG ATCGTGGTCG 
9601 GTGCGAAGTT CAGCTCCGTG GCGATCTCGC GGACGGACTG CGACTTCGGC CCCCATCCGA 
9661 TGCGGGCCAG CACGAAGTAC TCCGCCACAC CGAGGCGTTC CAGACGCTCC CACGCGAGGT 
9721 CGTGGTCGTT CTTGCTCGCC ACCGCCTGGA GGATGCCGCG GTCGTCGAGC GTGGTGATCA 
9781 CCTCGCGGAT CTCGTCGGTG AGGACCACCT CGTCGTCCTC CAGCACGGTG CCCCGCCACA 
9841 AGGTGTTGTC CAGGTCCCAG ACCAGACACT TGACAATGGT CATGGCTGTC CTCTCAAGCC 
9901 GGGAGCGCCA GCGCGTGCTG GGCCAGCATC ACCCGGCACA TCTCGCTGCT GCCCTCGATG 
9961 ATCTCCATGA GCTTGGCGTC GCGGTACGCC CGTTCGACGA CGTGTCCCTC TCTCGCGCCT 

10021 GCCGACGCGA GCACCTGTGC GGCGGTCGCG GCCCCGGCGG CGGCTCGTTC GGCGGCGACG 
10081 TGCTTGGCCA GGATCGTCGC GGGCACCATC TCGGGCGAGC CCTCGTCCCA GTGGTCGCTG 
10141 GCGTACTCGC ACACGCGGGC CGCGATCTGC TCCGCGGTCC ACAGGTCGGC GATGTGCCCG 
10201 GCGACGAGTT GGTGGTCGCC GAGCGGCCGG CCGAACTGCT CCCGGGTCCG GGCGTGGGCC 
10261 ACCGCGGCGG TGCGGCAGGC CCGCAGGATC CCGACGCAGC CCCAGGCGAC CGACTTGCGC 
10321 CCGTAGGCGA GTGACGCCGC GACCAGCATC GGCAGTGACG CGCCGGAGCC GGCCAGGACC 
10381 GCGCCGGCCG GCACACGCAC CTGGTCCAGG TGCAGATCGG CGTGGCCGGC GGCGCGGCAG 
10441 CCGGACGGCT TCGGGACGCG CTCGACGCGT ACGCCGGGGG TGTCGGCGGG CACGACCACC 
10501 ACCGCACCGG AACCATCCTC CTGGAGACCG AAGACGACCA GGTGGTCCGC GTAGGCGGCG 
10561 GCAGTCGTCC AGACCTTGTG GCCGTCGACG ACAGCGGTGT CCCCGTCGAG CCGAACCCGC 
10621 GTCCGCATCG CCGACAGATC GCTGCCCGCC TGCCGCTCAC TGAAGCCGAC GGCCGCGAGT 
10681 TTCCCGCTGG TCAGCTCCTT CAGGAAGGTC GCCCGCTGAC CGGCGTCGCC GAGCCGCTGC 
10741 ACGGTCCACG CGGCCATGCC CTGCGACGTC ATGACACTGC GCAGCGAACT GCAGAGGCTG 
10801 CCGACGTGTG CGGTGAACTC GCCGTTCTCC GGGCTGCCGA GTCCCAGACC GCCGTGCTCG 
10861 GCCGCCACTT CCGCGCAGAG CAGGCCGTCG GCGCCGAGCC GGACGAGCAG GTCGCGCGGC 
10921 AGTTCGCCGG ACGTGTCCCA CTCGGCGGCC CGGTCACCGA CAAGGTCGGT CAGCAGCGCG 
10981 TCACGCTCAG GCATCGACGG CCCGCAGCCG GTGGACGAGT GCGACCATGG ACTCGACGGT
11041 ACGGAAGTTC GCGAGCTGGA GGTCCGGGCC GGCGATCGTG ACGTCGAACG TCTTCTCCAG 
11101 GTACACGACC AGTTCCATCG CGAACAGCGA CGTGAGGCCG CCCTCCGCGA ACAGGTCGCG 
11161 GTCCACGGGC CAGTCCGACC TGGTCTTCGT CTTGAGGAAC GCGACCAACG CGTGCGCGAC 
11221 GGGGTCGTCC TTGACGGGTG CGGTCATGAG AACACCTTCT CGTATTCGTA GAAGCCCCGG
11281 CCGGTCTTCC GGCCGTGGTG TCCCTCGCGG ACCTTGCCCA GCAGCAGGTC ACAGGGGCGG 
11341 CTGCGCTCGT CGCCGGTGCG TTTGTGCAGC ACCCACAGCG CGTCGACGAG GTTGTCGATG 
11401 CCGATCAGGT CCGCGGTGCG CAGCGGCCCG GTCGGATGGC CGAGGCACCC CGTCATGAGC
11461 GCGTCGACGT CCTCGACGGA CGCGGTGCCC TCCTGCACGA TCCGCGCCGC GTCGTTGATC
11521 ATCGGGTGGA GCAGCCGGCT CGTGACGAAG CCGGGCGCGT CCCGGACGAC GATCGGCTTG
11581 CGCCGCAGCG CCGCGAGCAG GTCCCCGGCG GCGGCCATGG CCTTCTCACC GGTCCGGGGT 
11641 CCGCGGATCA CCTCGACCGT CGGGATCAGG TACGACGGGT TCATGAAGTG CGTGCCGAGC 
11701 AGGTCCTCGG GCCGGGCCAC GGAGTCGGCC AGTTCGTCAA CCGGGATCGA CGACGTGTTC 
11761 GTGATGACCG GGATACCGGG CGCCGCTGCC GAGACCGTGG CGAGTACCTC CGCCTTGACC
11821 TCGGCGTCCT CGACGACGGC CTCGATCACC GCGGTGGCCG TACCGATCGC GGGCAGCGCG
11881 GACGTGGCCG TCCGCAGCAC ACCGGGGTCG GCCTCGGCGG GCCCGGCCAC GAGTTGTGCC 
11941 GTCCGCAGTT CGGTGGCGAT CCGCGCCCGC GCCGCCGTAA GGATCTCCTC GGACGTGTCG 
12001 ACGAGTGTCA CCGGGACGCC GTGGCGCAGC GCGAGCGTGG TGATGCCGGT GCCCATCACT 
12061 CCCGCGCCGA GCACGATCAG CTGGTGGTCC ACGCTGTTTC CTCCCTCCGG GGTCACCATG 

12121 GCAGCGAGTA CGGGTCGAGG ACGTCTTCCG GGGTCGACCC GATCGCGTCC TTGCGGCCGA
12181 GGCCGAGTTC GTCGGCGAAG CCGAGCAGCA CGTCGAACGC GATGTGGTCG GCGAACGCGC 
12241 TGCCCGTCGA GTCGAGGACG CTCAGGCTGT CCCGGTGGTC CGCCGCGGTG TCCGGTGCCG 
12301 CGCACAGGGC CGCCAGCGAC GGGCCGAGCT CGCGGTCCGG CAGTTGCTGG TACTCGCCCT 
12361 CGGCGCGGGC CTGCCCCGGA TGGTCGACGC AGATGAACGC GTCGTCGAGC AGGGTCTTCG 

12421 GCAGTTCGGT CTTGCCCGGC TCGTCGGCGC CGATGGCGTT CACATGCAGG TGCGGCAGCC
12481 GCGGCTCGGC GGGCAGCACC GGCCCTTTGC CCGAGGGCAC CGAGGTGACG GTGGACAGGA
12541 CATCCGCGGC GGCGGCGGCC TCCGCCGGAT CGGTCACCTT GACCGGCAGT CCGAGGAACG 
12601 CGATGCGGTC CGCGAACGAC GCCGCGTGGC CGGGGTCGGT GTCGCTGACC AGGATCCGCT
12661 CGATGGGCAG GACCCTGCTG AGCGCGTGCG CCTGGGTCAC CGCCTGTGCG CCCGCGCCGA 

12721 TCAGCGTGAG CGTGGCGCTG TCGGACCGGG CCAGCAGCCG GCTCGCGACG GCGGCGACCG
12781 CGCCGGTCCG CATCGCGGTG ATCACGCCTG CGTCGGCGAG GGCGGTCAGA CTGCCGCTGT 
12841 CGTCGTCGAG GCGCGACATC GTGCCGACGA TCGTCGGCAG CCGGAAGCGC GGATAGTTGT 
12901 GCGGACTGTA CGAAACCGTC TTCATGGTCA CGCCGACACC GGGGACCCGG TACGGCATGA
12961 ACTCGATGAC GCCGGGAATG TCGCCGCCGC GGACGAATCC GGTACGCGGC GGCGCCTCGG
13021 CGAACTCGCC GCGGCCGAGC GCGGCGAACC CGTCGTGCAG CTCGCTGATC AGCCGGTCCA
13081 TCATCACGTC GCGGCCGATC ACGGAGAGAA TCCGCTTGAT GTCACGTTGG CGCAGGACCC 
13141 TGGTCTGCAT GTGTCACCTC CCTTTCGTGG CCGGAGCTGT CTTGGTGGTG CCGCTCGGGG 
13201 CGGCTTCCGT TCTCATCGCA GCTCCCTGTC GATGAGGTCG AAAATCTCGT CCGCGGTCGC 
13261 GTCCGCGGAC AGCACGCCGG CCGGCGTGGT CGGGCGGGTC TCCCGCCGCC AGCGGTTGAG
13321 CAGGGCGTCC AGCCGGGTTC CGATCGCGTC CGCCTGGCGG GCGCCCGGGT CGACACCGGC
13381 AACGAGTGCT TCCAGCCGGT CGAGCTGCGC GAGCACCACG GTCACCGGGT CGTCCGGGGA 
13441 CAGCAGTTCA CCGATGCGGT CGGCGAGTGC GCGCGGCGAC GGGTAGTCGA AGACGAGCGT 
13501 GGCGGACAGT CGCAGACCGG TCGCCTCGTT GAGGCCGTTG CGCAGCTGCA CCGCGATGAG 
13561 CGAGTCCACA CCGAGTTCCC GGAACGCCGC GTCCTCCGGG ATGTCCTCCG GGTCGGCGTG 

13621 GCCCAGGACG GCCGCTGCCT TCTGCCGGAC GAGGGCGAGC AGGTCGGTGG GGCGTTCCTG
13681 CTCGTTGCGG GCGCTCCGGC GGGCCGACGG CTTGGGCCGG CCACGCAGCA GCGGGAGGTC 
13741 CGGCGGCAGG TCGCCCGCCA CGGCGACGAC ACTGCCCGTT CCGGTGTGGA CGGCGGCGTC 
13801 GTACATGCGC ATGCCCTGTT CGGCGGTGAG CGCGCTCGCC CCACCCTTGC GCATACGGCG 
13861 CCGGTCGGCG TCGGTCAGGT CCGCGGTCAG GCCACTCGCC TGGTCCCACA GCCCCCACGC 

13921 GATCGACAGC CCTGGCAGCC CTTGTGCACG CCGGTGTTCG GCGAGCGCGT CGAGGAACGC
13981 GTTCGCCGCC GCGTAGTTGC CCTGACCGGG GGTGCCCAGC ACACCGGCCG CCGACGAGTA
14041 GACGACGAAT GCGGCGAGGT CGGTGTCGCG GGTGAGCCGG TGCAGGTGCC AGGCGGCGTC
14101 GGCCTTGGGT TTGAGGACGG TGTCGATGCG GTCGGGGGTG AGGTTGTCGA GCAGGGCGTC 
14161 GTCGAGGGTT CCGGCGGTGT GGAAGACGGC GGTGAGGGGT TGAGGGATGT GGGCGAGGGT 
14221 GGTGGCGAGT TGGTGGGGGT CGCCGACGTC GCAGGGGAGG TGGGTGCCGG GGGTGGTGTC
14281 GGGGGGTGGG GTGCGGGAGA GGAGGTAGGT GTGGGGGTGG TTCAGGTGGC GGGCGAGGAT
14341 GCCGGCGAGG GTGCCGGAGC CGCCGGTGAT GACGACGGCC CCCTCGGGGT CCAGCGGCCG
14401 CGGGACCGTG AGGACGATCT TGCCGGTGTG CTCGCCGCGG CTCATGGTCG CCAGCGCCTC
14461 GCGGACCTGC CGCATGTCGT GCACCGTCAC CGGCAGCGGG TGCAGCACAC CGCGCGCGAA
14521 CAGGCCGAGC AGCTCCGCGA TGATCTCCTT GAGCCGGTCG GGCCCCGCGT CCATCAGGTC
14581 GAACGGTCGC TGGACGGCGT GCCGGATGTC CGTCTTCCCC ATCTCGATGA ACCGGCCACC
14641 CGGCGCGAGC AGGCCGACGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT TGAGCACGAC
14701 GTCGACCGGC GGGAACGCGT CGGCGAACGC GGTGCTGCGG GAATCGGCCA GATGCGCTCC
14761 GTCCAGGTCC ACCAGATGGC GCTTCGCGGC GCTGGTGGTC GCGTACACCT CCGCGCCCAG
14821 GTGCCGCGCG ATCTGCCGGG CGGCGGAACC GACACCGCCG GTGGCCGCGT GGATCAGGAC
14881 CTTCTCGCCG GGGCGCAGCC CGGCGAGGTC GACCAGGCCG TACCACGCGG TCGCGAACGC
14941 GGTCATCACG GACGCCGCCT GCGGGAACGT CCAGCCGTCC GGCATCCGGC CGAGCATCCG
15001 GTGGTCGGCG ATGACCGTGG GGCCGAAGCC GGTGCCGACG AGGCCGAAGA CGCGGTCGCC
15061 CGGTGCCAGA CCGGAGACGT CGGCGCCGGT CTCCAGGACG ATGCCCGCGG CCTCGCCGCC
15121 GAGCACGCCC TGACCGGGGT AGGTGCCGAG CGCGATCAGC ACATCGCGGA AGTTGAGGCC
15181 CGCCGCACGC ACACCGATCC GGACCTCGGC CGGGGCGAGG GGGCGCCGGG GCTCCGCCGA
15241 GTCGGCCGCG GTGAGGCCGT CGAGGGTGCC CGTCCGCGCC GGCCGGATCA GCCACGTGTC
15301 GCTGTCCGGC ACGGTGAGCG GCTCCGGCAC CCGGGTGAGG CGGGCCGCCT CGAACCGGCC
15361 GCCGCGCAGC CGCAGACGCG GCTCGCCGAG TGCGACGGCG ATGCGCTGCT GCTCGGGGGC
15421 GAGCGTGACG CCGGACTCGG TCTCGACGTG GACGAACCGG CCGGGCTGCT CGGCCTGGGC
15481 GGCGCGCAGC AGTCCGGCCG CCGCGCCGGT GGCGAGGCCC GCGGTGGTGT GCACGAGCAG
15541 ATCCCCGCCG GAGCCGGTCA GGGCGGTCAG CAGCCGGGTG GTGAGCGCAC GCGTCTCGGC
15601 CACCGGGTCG TCGCCATCAG CGGCAGGCAA CGTGATGACG TCCACGTCGG TCGCGGGGAC
15661 ATCCGTGGGT GCGGCGACCT CGATCCAGGT GAGACGCATC AGGCCGGTGC CGACGGGTGG
15721 GGACAGCGGG CGGGTGCGGA CCGTCCGGAT CTCGGCGACG AGTTGGCCGG CGGAGTCGGC
15781 GACGCGCAGA CTCAGCTCGT CGCCGTCACG AGTGATCACG GCTCGGAGCA TGGCCGAGCC
15841 CGTGGCGACG AACCGGGCCC CCTTCCAGGC GAACGGCAGA CCCGCAGCGC TGTCGTCCGG
15901 CGTGGTGAGG GCGACGGCGT GCAGGGCCGC GTCGAGCAGC GCCGGATGCA CACCGAAACC
15961 GTCCGCCTCG GCGGCCTGCT CGTCGGGCAG CGCCACCTCG GCATACACGG TGTCACCATC
16021 ACGCCAGGCA GCCCGCAACC CCTGGAACGC CGACCCGTAC TCATAACCGG CATCCCGCAG
16081 TTCGTCATAG AACCCCGAGA CGTCGACGGC CACGGCCGTG ACCGGCGGCC ACTGCGAGAA
16141 CGGCTCCACA CCGACAACAC CGGGGGTGTC GGGGGTGTCG GGGGTCAGGG TGCCGCTGGC
16201 GTGCCGGGTC CAGCTGCCCG TGCCCTCGGT ACGCGCGTGG ACGGTCACCG GCCGCCGTCC
16261 GGCCTCATCA GCCCCTTCCA CGGTCACCGA CACATCCACC GCTGCGGTCA CCGGCACCAC
16321 AAGGGGGGAT TCGATGACCA GCTCGTCCAC TATCCCGCAA CCGGTCTCGT CACCGGCCCG
16381 GATGACCAGC TCCACAAACG CCGTACCCGG CAGCAGGACC GTGCCCCGCA CCGCGTGATC
16441 AGCCAGCCAG GGGTGAGTGC GCAATGAGAT CCGGCCAGTG AGAACAACAC CACCATCGTC
16501 GGCGGGCAGC GCTGTGACAG CGGCCAGCAT CGGATGCGCC GCACCCGTCA ACCCCGCCGC
16561 CGACAGATCG GTGGCACCGG CCGCCTCCAG CCAGTACCGC CTGTGCTCGA ACGCGTACGT
16621 GGGCAGATCC AGCAGCCGTC CCGGCACCGG TTCGACCACC GTGTCCCAGT CCACTGCCGT
16681 GCCCAGGGTC CACGCCTGCG CCAACGCCGT CAGCCACCGC TCCCAGCCGC CGTCACCGGT
16741 CCGCAACGAC GCCACCGTGT GAGCCTGCTC CATCGCCGGC AGCAGCACCG GATGGGCACT
16801 GCACTCCACG AACACCGACC CATCCAGCTC CGCCACCGCC GCGTCCAACG CCACCGGACG
16861 ACGCAGATTC CGGTACCAGT ACCCCTCATC CACCGGCTCC GTCACCCAGG CGCTGTCCAC
16921 GGTCGACCAC CACGCCACCG ACGCGGCCTT CCCTGCCACC CCCTCCAGTA CCTTGGCCAG
16981 TTCATCCTCG ATGGCTTCCA CGTGGGGCGT GTGGGAGGCG TAGTCGACCG CGATACGACG
17041 CACCCGCACG CCTTCGGCCT CATACCGCGC CACCACCTCC TCCACCGCCG ACGGGTCCCC
17101 CGCCACCACC GTCGAAGCCG GGCCGTTACG CGCCGCGATC CACACACCCT CGACCAGACC
17161 GACCTCACCG GCCGGCAACG CCACCGAAGC CATCGCTCCC CGCCCGGCCA GTCGCGCCGC
17221 GATGACCTGA CTGCGCAATG CCACCACGCG GGCGGCGTCC TCGAGGCTGA GGGCTCCGGC
17281 CACGCACGCC GCCGCGATCT CGCCCTGGGA GTGTCCGATC ACCGCGTCCG GCACGACCCC
17341 ATGCGCCTGC CACAGCGCGG CCAGGCTCAC CGCGACCGCC CAGCTGGCCG GCTGGACCAC
17401 CTCCACCCGC TCCGCCACAT CCGGCCGCGC CAACATCTCC CGCACATCCC AGCCCGTGTG




17461 CGGCAGCAAC GCCTGAGCGC ACTCCTCCAT ACGCGCGGCG AACACCGCGG AGTGGGCCAT
17521 GAGTTCCACG CCCATGCCGA CCCACTGGGC GCCCTGGCCG GGGAAGACGA ACACCGTACG
17581 CGGCTGGTCC ACCGCCACAC CCGTCACCCG GGCATCGCCC AGCAGCACCG CACGGTGACC
17641 GAAGACAGCA CGCTCCCGCA CCAACCCCTG CGCGACCGCG GCCACATCCA CACCACCCCC
17701 GCGCAGATAC CCCTCCAGCC GCTCCACCTG CCCCCGCAGA CTCACCTCAC CACGAGCCGA
17761 CACCGGCAAC GGCACCAACC CGTCAACAAC CGACTCCCCA CGCGACGGCC CAGGAACACC
17821 CTCAAGGATC ACGTGCGCGT TCGTACCGCT CACCCCGAAC GACGACACAC CCGCATGCGG
17881 TGCCCGATCC GACTCGGGCC ACGGCCTCGC CTCGGTGAGC AGCTCCACCG CACCGGCCGA
17941 CCAGTCCACA TGCGACGACG GCTCGTCCAC ATGCAGCGTC TTCGGCGCGA TCCCGTACCG
18001 CATCGCCATG ACCATCTTGA TCACACCGGC GACACCCGCC GCCGCCTGCG CATGACCGAT
18061 GTTCGACTTC AACGAACCCA GCAGCAGCGG AACCTCACGC TCCTGCCCGT ACGTCGCCAG
18121 AATGGCCTGC GCCTCGATGG GATCGCCCAG CGTCGTCCCC GTCCCGTGCG CCTCCACCAC
18181 GTCCACATCG GCGGCGCGCA GTCCGGCGTT CACCAACGCC TGCTGGATGA CACGCTGCTG
18241 GGACGGGCCG TTGGGGGCGG ACAGCCCGTT GGAGGCACCG TCCTGGTTCA CCGCCGACCC
18301 GCGGACGACC GCGAGAACGG TGTGTCCGTT GCGCTCGGCG TCGGAGAGCC GCTCCAGCAC
18361 AAGAACGCCG GCGCCCTCCG CCCAGCCGGT GCCGTTGGCG GCGTCCGCGA ACGCGCGGCA
18421 GCGGCCGTCG GGGGAGAGTC CGCCCTGCTG CTGGAATTCC ACGAACCCGG TCGGGGTCGC
18481 CATGACGGTG ACACCGCCGA CCAGCGCCAG CGAGCACTCC CCGTGGCGCA GTGCGTGCCC
18541 GGCCTGGTGC AGCGCGACCA GCGACGACGA GCACGCCGTG TCCACCGTGA ACGCCGGTCC
18601 CTGGAGCCCA TAGAAGTACG AGATCCGGCC GGTGAGCACG CTGGGCTGCA TGCCGATCGA
18661 GCCGAACCCG TCCAGGTCCG CGCCGACGCC GTACCCGTAC GAGAAGGCGC CCATGAACAC
18721 GCCGGTGTCG CTGCCGCGCA GTGTGCCCGG CACGATGCCC GCGCTCTCGA ACGCCTCCCA
18781 TGTCGTTTCC AGCAGGATCC GCTGCTGGGG GTCCATGGCC CGTGCCTCAC GGGGGCTGAT
18841 GCCGAAGAAC GCGGCATCGA AGCCGGCGGC GTCGGAGAGG AAGCCGCCGC GGTCCGTGTC
18901 CGATCCGCCG GTGAGGCCGG ACGGGTCCCA GCCACGGTCG GCCGGGAAGC CGGTGACCGC
18961 GTCGCCGCCA CTGTCCACCA TGCGCCACAG GTCGTCGGGC GAGGTGACGC CGCCCGGCAG
19021 TCGGCAGGCC ATGCCCACGA TGGCCAGCGG TTCGTCACGG GTCGCGGCGG CTGTGGGAAC
19081 AGCGACCGGT GCGGCACCAC CGACCAGAGC CTCGTCCAAC CGCGACGCGA TGGCCCGCGG
19141 CGTCGGGTAG TCGAAGACAA GCGTGGCGGG CAGTCGGACA CCGGTCGCCG CGGCGAGTCG
19201 GTTCCGCAGT TCGACGGCGG TCAGCGAGTC GATACCCAGT TCCTTGAAGG CCGCGTCCGC
19261 GGACACGTCC GCGGCGTCCG CGTGGCCGAG CACCGCCGCC GCGTTGTCGC GGACCAGTGC
19321 CAGCAGCGCG GTGTCCCGCT CAGCGCCGGA CATGGTGCCG AGCCGGTCGG CGAGCGGAAC
19381 GGCGGTGGCC GCCGCCGGGC GCGATACGGC GCGGCGCAGA TCGGCGAAAA GCGGCGATGT
19441 GTGCGCGGTG AGGTCCATCG TGGCCGCCAC GGCGAACGCG GTGCCGGTTC CGGCCGCGGC
19501 TTCCAGCAGG CGCATGCCCA CACCGGCCGA CATGGGGCGG AAACCGCCGC GGCGGACACG
19561 GGTGCGGTTG GTGCCGCTCA TGCTGCCGGT GAGTCCGCTG TCATCGGCCC AGAGGCCCCA
19621 GGCCAGCGAC AGCGCGGGCA GTCCTTCGGC ATGGCGCAGC GTCGCGAGTC CGTCGAGGAA
19681 CCCGTTCGCC GCCGAGTAGT TGCCCTGGCC GCGGCCGCCC ATGATGCCCG CGACGGACGA
19741 GTAGAGGACG AACGAGCGCA GGTCCGCGTC CCGGGTCAGC TCGTGCAGGT GCCAGGCGCC
19801 GTCGGCTTTG GGGCGCAGTG TGGTGGCGAG CCGCTCCGGG GTGAGTGCCG TGGTCACGCC
19861 GTCGTCGAGC ACGGCTGCCG TGTGGAAGAC CGCCGTGAGC GGCCTGCCGG CGGCGGCGAG
19921 CGCGGCGGCG AGCTGGTCCC GGTCGGCGAC GTCACAGCGG ATGTGGACAC CGGGAGTGTC
19981 CGCCGGCGGT TCGCTGCGCG ACAGCAACAG GAGGTGGCGG GCGCCATGCT CGGCGACGAG
20041 ATGCCGGGCG AGGAGACCTG CCAGCACACC CGAGCCGCCG GTGATGACCA CCGTGCCGTC
20101 CGGGTCGAGC AGCGGTTCGG GCGTTTCCGC GGCGGCCGTG CGGGTGAACC GCGGCGCTTC
20161 GTACCGGCCG TCGGTGACGC GGACGTACGG CTCGGCCAGT GTCGTGGCGG CGGCCAGCGC
20221 CTCGATGGGG GTGTCGGTGC CGGTCTCCAC CAGCACGAAC CGGCCCGGGT GCTCGGCCTG
20281 GGCGGACCGG ACGAGGCCGG CGACCGCTCC TCCGACCGGT CCCGCGTCGA TCCGGACGAC
20341 GAGGGTGGTC TCCGCAGGGC CGTCCTCGGC GATCACCCGG TGCAGCTCGC CGAGCACGAA
20401 CTCGGTGAGC CGGTACGTCT CGTCGAGGAC ATCCGCGCCC GGTTCCGGGA GCGCGGAGAC
20461 GATGTGGACC GCGTCCGCAG GACCGGGCCC GGGAGTGGGC AGCTCGGTCC AGGAGAGGCC
20521 GTACAAGGAG TTCCGTACGA CGGCGGCGTC GCCGTCGACG TTCACCGGTC GCGCGGTCAG
20581 CGCGGCGACG GTCACCACCG GTTGGCCGAC CGGGTCCGTC GCATGCACGG CAGCGCCGTC
20641 CGGGCCCTGA GTGATCGTGA CGCGCAGCGT GGTGGCCCCG GTCGTGTGGA ACCGCACGCC
20701 GCTCCACGAG AACGGCAGCC GCACCTCCGC TTCCTGTTCC GCGAGCAGCG GCAGGCAGGT
20761 GACGTGCAAG GCCGCGTCGA ACAGCGCCGG GTGGACGCCA TAGTGCGGCG TGTCGTCCGC 
20821 CTGTTCCCCG GCGATCTCCA CCTCGGCGTA CAGGGTTTCG CCGTCGCGCC AGGCGGTGCG 
20881 CAGTCCCTGG AACGCTGGGC CGTAGCTGTA GCCGGTCTCG GCCAGCCGCT CGTAGAACGC 
20941 GCTCACGTCG ACGCGTCGCG CGCCCGGCGG CGGCCACGCG GGCGGCGGGA CCGCCGCGAC
21001 GCTTCCGGCC CGGCCGAGGG TGCCGCTGGC GTGCCGGGTC CAGCTGTCCG TGCCCTCGGT 
21061 ACGCGCGTGG ACGGTCACTC GCCGCCGTCC GGCCTCATCG GCCCCTTCGA CGGTCACCGA 
21121 CACATCCACC GCGCCGGTCA CCGGCACCAC GAGCGGGGTC TCGATGACCA GTTCATCCAC 
21181 CACCCCGCAA CCGGTCTCGT CACCGGCCCG GATGACCAGC TCCACAAACG CCGTACCCGG 

21241 CAGCAGAACC GTGCCCCGCA CCGCGTGATC AGCCAGCCAG GGATGCGTAC GCAACGAGAT
21301 CCGGCCAGTG AGAACAACAC CACCACCGTC GTCGGCGGGC AGTGCTGTGA CGGCGGCCAG 
21361 CATCGGATGC GCCGCCCCGG TCAGCCCGGC CGCGGACAGA TCGGTGGCAC CGGCCGCCTC 
21421 CAGCCAGTAC CGCCTGTGCT CGAACGCGTA GGTGGGCAGA TCGAGCAGCC GTCCCGGCAC
21481 CGGTTCGACC ACCGTGTCCC AGTCCACTGC CGTGCCCAGG GTCCACGCCT GCGCCAACGC
21541 CGTCAGCCAC CGCTCCCAGC CGCCGTCACC GGTCCGCAAC GACGCCACCG TGTGAGCCTG
21601 TTCCATCGCC GGCAGCAGCA CCGGATGGGC GCTGCACTCC ACGAACACGG ACCCGTCCAG 
21661 CTCCGCCACC GCCGCGTCCA GCGCGACGGG GCGACGCAGG TTCCGGTACC AGTAGCCCTC 
21721 ATCCACCGGC TCGGTCACCC AGGCGCTGTC CACCGTGGAC CACCAGGCCA CCGACCCGGT 
21781 CCCGCCGGAA ATCCCCTCCA GTACCTCGGC CAACTCGTCC TCGATGGCTT CCACGTGGGG
21841 CGTGTGGGAG GCGTAGTCGA CCGCGATACG GCGCACTCGC ACGCCTTCGG CCTCGTACCG
21901 CGTCACCACT TCTTCCACCG CGGACGGGTC CCCCGCCACC ACAGTCGAAG ACGGGCCGTT 
21961 ACGCGCCGCG ATCCACACGC CCTCGACCAG GTCCACCTCA CCGGCCGGCA ACGCCACCGA 
22021 AGCCATCGCC CCCCGCCCGG CCAGCCGCCC GGCGATCACC TGGCTGCGCA AGGCCACCAC 
22081 GCGGGCGGCG TCCTCAAGGC TGAGGGCTCC GGCCACACAC GCCGCCGCGA TCTCGCCCTG 

22141 GGAGTGTCCG ACCACCGCGT CCGGCACGAC CCCATGCGCC TGCCACAGCG CGGCCAGGCT
22201 CACCGCGACC GCCCAGCTGG CCGGCTGGAC CACCTCCACC CGCTCCGCCA CATCCGGCCG
22261 CGCCAACATC TCCCGCACAT CCCAGCCCGT GTGCGGCAAC AACGCCCGCG CACACTCCTC
22321 CATACGAGCC GCGAACACCG CAGAACACGC CATCAACTCC ACACCCATGC CCACCCACTG 
22381 AGCACCCTGC CCGGGAAAGA CGAACACCGT ACGCGGCTGA TCCACCGCCA CACCCATCAC 

22441 CCGGGCATCG CCCAACAACA CCGCACGGTG ACCGAAGACA GCACGCTCAC GCACCAACCC
22501 CTGCGCGACC GCGGCCACAT CCACACCACC CCCGCGCAGA TACCCCTCCA GCCGCTCCAC 
22561 CTGCCCCCGC AGACTCACCT CACTCCGAGC CGACACCGGC AACGGCACCA ACCCATCGAC 
22621 AGCCGACTCC CCACGCGACG GCCCGGGAAC ACCCTCAAGG ATCACGTGCG CGTTCGTACC 
22681 GCTCACCCCG AAAGCGGAGA CACCGGCCCG GCGCGGACGT CCCGCGTCGG GCCACGCCCG
22741 CGCCTCGGTG AGCAGTTCCA CCGCGCCCTC GGTCCAGTCC ACATGCGACG ACGGCTCGTC
22801 CACATGCAGC GTCTTCGGCG CGATGCCATA CCGCATCGCC ATGACCATCT TGATGACACC 
22861 GGCGACACCC GCAGCCGCCT GCGCATGACC GATGTTCGAC TTCAACGAAC CCAGCAGCAG
22921 CGGAACCTCA CGCTCCTGCC CGTACGTCGC CAGAATCGCG TGCGCCTCGA TGGGATCGCC
22981 CAGCGTCGTC CCCGTCCCGT GCGCCTCCAC CACGTCCACG TCGGCGGGGG CGAGCCCCGC
23041 CTTGTGGAGG GCCTGGCGGA TGACGCGCTG CTGGGAGGGG CCGTTGGGTG CGGAGATGCC
23101 GTTGGAGGCG CCGTCCTGGT TGACGGCGGA GGAGCGGACG ACCGCGAGGA CGGTGTGTCC 
23161 GTTGCGCTCG GCGTCGGAGA GCTTTTCGAC GACGAGGACG CCGGCCCCCT CGGCGAAACC 
23221 GGTGCCGTCC GCCGCGTCAG CGAACGCCTT GCACCGTCCG TCCGGCGCGA CGCCGCCCTG 
23281 CCGGGAGAAC TCCACGAAGG TCTGTGGTGA TGCCATCACT GTGACACCAC CGACCAGCGC 

23341 CAGCGAGCAC TCCCCGGTCC GCAGCGCCTG CCCGGCCTGG TGCAGCGCGA CCAGCGACGA
23401 CGAACACGCC GTGTCGACCG TGACCGCCGG ACCCTCCATG CCGAAGAAGT ACGACAGCCG
23461 TCCGGCGAGC ACCGCGGGCT GTGTGCTGTA GGCGCCGAAT CCGCCCAGGT CCGCGCCCGT
23521 GCCGTAGCCG TAGTAGAAGC CGCCGACGAA GACGCCGGTG TCGCTGCCGC GCAGGGTGTC 
23581 CGGCACGATG CCGGCGTGTT CGAGCGCCTC CCAGGCGATT TCGAGGAGGA TCCGCTGCTG 

23641 CGGGTCGAGT GCGGTGGCCT CGCGCGGACT GATGCCGAAG AACGCGGCAT CGAAGTCGGC
23701 GGCGCCCGCG AGTGCGCCGG CCCGCCCGGT GGCGGACTCG GCGGCGGCGT GCAGCGCGGC
23761 CACGTCCCAG CCGCGGTCGG TGGGGAAGTC GCCGATCGCG TCGCGGCCGT CCGCGACGAG
23821 CTGCCACAGC TCTTCCGGTG AGGTGACGCC GCCCGGCAGT CGGCAGGCCA TGCCGACGAC
23881 GGCGAGCGGC TCGTTCGCCG CGGCGCGCAG CGCGGTGTTC TCCCGGCGGA GCTGCGCGTT
23941 GTCCTTGACC GACGTCCGCA GCGCCTCGAT CAGGTCGTTC TCGGCCATCG CCTCATCCCT
24001 TCAGCACGTG CGCGATGAGC GCGTCTGCGT CCATGTCGTC GAACAGTTCG TCGTCCGGCT
24061 CCGCGGTCGT GGTGCTCGCG GGTGCCTGTG CCGGTGGTTC ACCGCCGTCC GGGGTCCCGT
24121 TGTCGTCCGG GGTCCCGTTG ACGTCCGGGG CCAGGAGGGT CAGCAGATGA CGGGTGAGCG
24181 CGCCGGCGGC GGGATAGTCG AAGACGAGCG TGGCCGGCAG CGGAATGCCG AGGGCCTCGG
24241 AGAGCCGGTT GCGCAGGCCG AGCGCGGTGA GCGAGTCGAC CCCGAGGTCC TTGAACGCCG
24301 TGGTGGCCGT GACCGCCGCC GCGTCGGTGT GGCCCAGCAG GGTGGCGGCG GTGTCGCGGA
24361 CGACGCCGAG CAGCACCTGT TCCCGTTCCT TGTGGGGCAG GTCCGGCAGG CGTTCCAGCA
24421 GGGAGCCGCC GTCGGTCGCG GAGCGCCGGG TGGGGCGCTG GATCGGTCGC CACAGCGGTG
24481 ACGGGTCGCC GGGCCCGGGT GGGGCGGTCG CCACGACCAC GGCTTCCCCG GTGGCGCACG
24541 CGGCGTCGAG GAGGTCGGTC AGCCGGTCCG CCGCGGCGGT GAACGCCACG GCCGGCAGGC
24601 CTTGTGCCCG GCGCAGGTCG GCCAGGGCCT GGAGCGGTCC GGCCGCCTCG CCGGACGGAA
24661 CGGCGAGAAC GAACGCGGTC AGGTCGAGGT CGCGGGTCAG GCGGTGCAGT TCCCAGGCCG
24721 ACTCGGCGGT GCCGTCCGCG TGGACGACCG CGGTCACCGG GGTTTCCGGC ACTGTGCCCG
24781 GCTCGTACCG GATCACTTCG GCGCCGTGTC CGCCGAGGTG TCCGGCGAGT TCCTCCGAAC
24841 CGCCCGCGAG GAGGACGGTG TCGCCGTACG AGGCCGCGGC CGTGGTGGGC GCGGCGGGGA
24901 CGAGGCGGGG CGCTTCGAGG CGCCCGTCGG CCAGGCGCAG GTGCGGTTCG TCGAGGCGGG
24961 AGAGGGCGGC GGCGCGGCGG GGGGTGACCG TGTCGGTGGT CTCCACGAGC ACGAGCCGGC
25021 CCGGTTCCGC GGTGTCGAGC AGTGCGGCGA CGGCACCGGC GACGGGCCCG GCCTCGGCGG
25081 ACACCACCAG CGTGGCGCCG GCGGTCCTCG GGTCGTCCAG TGCGGTACGG ACCTCGTCGG

25141 GACCGGATAC CGGGACGACG ATGACGTCGG GCGTGGCGTC GTCGCCGAGG TCGGTGTACC 
25201 GGCGGGCCGT GGTGCCGGGT GCCGCCGGGG CCCGGACGCC GGTCCAGGTG CGCCGGAACA 
25261 GCCGCACGTC CCCGTCCGGG CCCGTCGTGG CGGGGGGCCG GGTGATGAGC GAGCCGATCT 
25321 GAGCCACCGG CCGTCCCAGT TCGTCGGCGA GGTGCACGCG GGCGCCGCCC TCGCCCTCGC 
25381 CGTGGACGAA GGTGACGCGC AGTTTCGTGG CGCCGCTGGT GTGGACACGG ACGCCGGTGA
25441 ACGCGAACGG CAACCGTACC CCCGCGTTCT CGGCGGCCGC GCCGATGCTG CCCGCTTGCA
25501 GCGCGGTGAC GAGCAGCGCC GGGTGCAGTG TGTAGCGGGC GGCGTCCCTG GCGAGGGCGC
25561 CGTCGAGGGC GACTTCGGCG CAGACGGTGT CTCCGTGGCT CCACGCGGCG GACATGCCGC
25621 GGAACTCGGG GCCGAACTCG TATCCCGCGT CGTCGAGTCG CTGGTAGAAG GCCGCGACGT
25681 CGACCGGTTC CGCGTGCTCG GGCGGCCAGG GCCCCGGCGT GGTGGCCGGT TCGGTGGTGG
25741 CGATGCCGGC GAAGCCGGAG GCGTGGCGGG TCCATGTCCG GTCGCCGTCC GTCCGGGCGT

25801 GGACGCGCAC GGCACGGCGT CCGGTGTCGT CGGGCGCGGC GACGGTCACG CGCACCTGGA 
25861 CGGCGCCGGT GGCGGGCAGG ACCAGCGGTG TCTCGACGAC CAGTTCGTCG AGCAGGTCGC 
25921 AGCCTGCCTC GTCGGCGCCG CGTCCGGCCA ATTCCAGGAA GGCGGGTCCG GGCAGCAGTA 
25981 CGGCGCCGTC GACGGAGTGA CCGGCCAGCC ATGGGTGGGT GGCCAGCGAG AACCGGCCGG
26041 TGAGCAGCAC CTCGTCGGAG TCGGGGAGCG CCACCGACGC GGCGAGCAGC GGGTGGTCGA

26101 CGGCGTCGAG TCCGAGGCCG GAAGCGTCCG TGCCGGCCGC GGTCTCGATC CAGTAGCGCT 
26161 CATGGTGGAA GGCGTATGTG GGCAGGTCGT GTGCCGTCGC CGTCGCGGGG ACGACCGCCG 
26221 CCCAGTCGAC GGGCACGCCG GTTGTGTGCG CCTCGGCCAG CGCGGTGAGC AGCCGGTGGA
26281 CTCCCCCGCC GCGGCGGAGC GTGGCGACGG TCGCGCCGTC GATCGCGGGC AGCAGCACGG
26341 GGTGCGCGCT GACCTCGACG AACACGGTGT CACCCGGCTC GCGGGCAGCG GTCACGGCCG
26401 TGGCGAAGCC TACGGGGTGG CGCATGTTGC GGAACCAGTA CTCGTCGTCG AGCGGCGCGT
26461 CGATCCAGCG TTCGTCGGCG GTGGAGAACC ACGGGATCTC GGGCGTGCGC GAGGTGGTGT
26521 CCGCGACGAT CCGCTGGAGT TCGTCGTACA GCGGGTCGAC GAACGGGGTG TGGGTCGGGC
26581 AGTCGACGGC GATGCGGCGC ACCCAGACGC CGCGGGCCTC GTAGTCGGCG ATCAGCGTTT
26641 CGACGGCGTC CGGGCGCCCG GCGACGGTCG TGGTGGTGGC GCCGTTGCGG CCCGCGACCC 
26701 AGACGCCGTC GATCCGGGCG GCATCCGCCT CGACGTCGGC GGCCGGGAGC GCGACCGAGC 
26761 CCATCGCGCC GCGTCCGGCG AGTTCGCGCA GGAGCAGGAG AACGCTGCGC AGCGCGACGA 
26821 GGCGGGCACC GTCCTCCAGG GTGAGCGCTC CGGCGACACA GGCCGCGGCG ATCTCGCCCT 
26881 GGGAGTGTCC GATGACGGCG TCCGGGCGTA CGCCCGCGGC CTCCCACACG GCGGCCAGCG
26941 ACACCATGAC GGCCCAGCAG ACGGGGTGCA CGACGTCGAC GCGGCGGGTC ACCTCCGGGT
27001 CGTCGAGCAT GGCGATGGGG TCCCAGCCCG TGTGCGGGAT CAGCGCGTCG GCGCATTGGC 
27061 GCATCCTGGC GGCGAACACC GGGGAGGCCG CCATCAGTTC GACGCCCATG CCGCGCCACT 
27121 GCGGTCCTTG TCCGGGGAAG ACGAAGACGG TGCGCGGCTC GGTGAGCGCC GTGCCGGTGA 
27181 CGACGTCGTC GTCGAGCAGC ACGGCGCGGT GCGGGAACGT CGTACGCCTG GCGAGCAGGC
27241 CCGCGGCGAT GGCGCGCGGG TCGTGGCCGG GACGGGCGGC GAGGTGCTCG CGGAGTCGGC 
27301 GGACCTGGCC GTCGAGGGCC GTGGCGGTCC GCGCCGAGAC GGGCAGTGGT GTGAGCGGCG 
27361 TGGCGATCAG CGGCTCACCG GGCTTCGAGG CCGACGGCTC CTCGGCCGGC GGCTCCCCGG 
27421 CCGGGTGGGC TTCCAGCAGG ACGTGGGCGT TGGTGCCGCT GACGCCGAAG GAGGACACAC 
27481 CGGCGCGCCG CGGGCGGTCG GTCTCGGGCC AGGGCCGGGC ATCGGTGAGG AGTTCGACGG
27541 CGCCGGCCGT CCAGTCGACG TGCGAGGACG GCGTGTCCAC GTGCAGGGTG CGCGGCAGGG
27601 TGCCGTGCCG CATGGCGAGG ACCATCTTGA TGACACCGGC GACACCCGCG GCGGCCTGAG
27661 TGTGGCCGAT GTTGGACTTC AGCGAGCCCA GCAGCACCGG GGTGTCGCGC CCCTGCCCGT
27721 AGGTGGCCAG CACCGCCTGT GCCTCGATGG GATCGCCCAG CCTGGTGCCG GTGCCGTGCG
27781 CCTCCACGGC GTCCACGTCC GCCGGGGTGA GCCCGGCGTT GGCCAGGGCC TGCCGGATCA
27841 CCCGCTCCTG CGAGGGCCCG TTCGGCGCCG ACAACCCGTT GGAAGCACCG TCCTGGTTGA
27901 CCGCCGAACC CCGGACAACC GCCAGCACAC GGTGGCCGTT GCGCTCGGCA TCGGAGAGCC
27961 TCTCGACGAT CAGCACACCG GACCCCTCGG CGAAACCGGT GCCGTCAGCC GCATCCGCGA
28021 ACGCCTTGCA GCGCGCGTCG GGCGCGAGAC CCCGCTGCTG GGAGAACTCG ACGAAGCCGG 
28081 ACGGCGAGGC CATCACCGTG ACGCCGCCGA CCAGGGCGAG CGAGCATTCG CCGGAGCGCA 
28141 GTGACTGCCC GGCCTGGTGC AGCGCCACCA GCGACGACGA ACACGCCGTG TCGACCGTGA 
28201 CCGCCGGACC CTCCAGACCG TAGAAGTACG ACAGCCGACC GGACAGCACA CTGGTCTGGG 
28261 TGCCGGTCGC GCCGAAACCG CCCAGGTCGG TGCCGAGTCC GTACCCGTCG GAGAAGGCGC
28321 CCATGAACAC GCCGGTGTCG CTTCCGCGCA GCGACTCCGG GAGGATCCCG GCGTGTTCCA 
28381 GCGCCTCCCA CGAGGTCTCC AGGACCAGAC GCTGCTGCGG GTCCATCGCC AGCGCCTCAC 
28441 GCGGACTGAT CCCGAAGAAC GCCGCGTCGA AGTCCGCCAC CCCGGCGAGG AAGCCACCAT
28501 GACGCACGGT CGACGTGCCC GGATGATCCG GATCGGGATC GTACAGCCCG TCCACGTCCC 
28561 AACCACGGTC CGTCGGAAAC GCCGTGATCC CGTCACCACC CGACTCCAGC AGCCGCCACA 

28621 AGTCCTCCGG CGACGCGACC CCACCCGGCA GCCGGCAGGC CATCCCCACG ATCGCCAACG
28681 GCTCGTCCTG CCGGACGGCC GCGGTCGTGG TGCGGGTCGG CGATGCCGTC CGGCCGGACA
28741 GCGCCGCGGT GAGCTTCGCC GCGACGGCGC GCGGCGTCGG GAAGTCGAAG ACCGCGGTGG
28801 CGGGCAGCCG TACGCCCGTC GCCTCGGTGA AGGCGTTGCG CAGCCGGATC GCCATGAGCG
28861 AGTCGACGCC GAGTTCCTTG AACGTGGCGG TCGCCTCGAC CCGTGCGGCA CCGTCGTGGC
28921 CGAGTACGGC CGCGGTGCAC TGCCGGACGA CGGCGAGCAC GTCCTTTTCG GCGTCCGCGG
28981 CGGAGAGCCG CGCGATCCGG TCGGCGAGGG TGGTGGCGCC GGCCGCCCGG CGCCGCGGCT 
29041 CCCGGCGCGG TGCGCGCAGC AGGGGCGAGC TGCCGAGGCC GGCCGGGTCG GCGGCGACCA 
29101 GCGCCGGGTC CGAGGACCGC AACGCCGCGT CGAACAGCGT CAGTCCGCCT TCGGCGGTCA
29161 GCGCCGTCAC GCCGTCGCGG CGCATGCGGG CGCCGGTGCC GACCGTCAGC CCGCTCTCCG
29221 GTTCCCACAG GCCCCAGGCC ACGGACAACG CGGGCAGTCC GGCTGCCCGG CGCTGTTCGG
29281 CCAGCGCGTC GAGGAACGCG TTCGCGGCCG CGTAGTTGCC CTGTCCGGGG CTGCCGAGCA 
29341 CACCGGCGGC CGACGAGTAG AGGACGAACG CGGCCAGTTC CGTGTCCTGG GTGAGTTCGT 
29401 GCAGGTGCCA CGCGGCGTCC ACCTTCGGGC GCAGCACCGT CTCGAGCCGG TCGGGGGTGA
29461 GCGCGGTGAG GACGCCGTCG TCGAGGACGG CCGCGGTGTG CACGACGGCC GTGAGCGGGT
29521 GCGCCGGGTC GATCCCCGCC AGTACGGAGG CGAGTTCGTC CCGGTCGGCG ACGTCGCAGG 
29581 CGATCGCCGT GACCTCGGCG CCGGGCACGT CGCTCGCCGT GCCGCTGCGC GACAGCATCA
29641 GCAGCCGGCG CACGCCGTGG CGTTCGACGA GGTGGCGGCT GATGATGCCG GCCAGCGTCC
29701 CGGAGCCACC GGTGACGAGC ACGGTGCCGT CCGGGTCGAG CGCCGGAGCG TCACCCGCCG
29761 GGACCGCCGG GGCCAGACGG CGGGCGTACA CCTGGCCGTC ACGCAGCACC ACCTGGGGCT
29821 CATCGAGCGC GGTGGCCGCT GCGAGCAGCG GCTCGGCGGT GTCCGGGGCG GCGTCGACGA
29881 GGACGATCCG GCCGGGGTGT TCGGCCTGCG CGGTCCGCAC CAGTCCGGCG GCCGCGGCCG
29941 ACGCGAGACC GGGCCCGGTG TGGACGGCCA GGACCGCGTC GGCGTACCGG TCGTCGGTGA 
30001 GGAAGCGCTG CACGGCGGTC AGGACGCCGG CGCCCAGTTC GCGGGTGTCG TCGAGCGGGG 
30061 CACCGCCGCC GCCGTGCGCG GGGAGGATCA CCACGTCCGG GACCGTCGGG TCGTCGAGGC 
30121 GGCCGGTCGT CGCGGTCGTG GGCGGCAGCT CCGGGAGCTC GGCCAGCACC GGGCGCAGCA 
30181 GGCCCGGAAC GGCTCCCGTG ATCGTCAGGG GGCGCCTGCG CACGGCGCCG ATGGTGGCGA 
30241 CGGGCCCGCC GGTCTCGTCC GCGAGGTGTA CGCCGTCAGC GGTGACGGCG ACGCGTACCG 
30301 CCGTGGCGCC GGTGGCGTGG ACGCGGACGT CGTCGAACGC GTACGGAAGG TGGTCCCCTT 
30361 CCGCGGCGAG GCGGAGTGCG GCGCCGAGCA GCGCCGGGTG CAGGCCGTAC CGTCCGGCGT 
30421 CGGCGAGCTG TCCGTCGGCG AGGGCCACTT
30481 CGGCGCGCGG GCGGGGCAGC GCGGGCCCGT
30541 CGATGTCGTC GGGGTCCACC GGCCGGGCCG 
30601 GCACGGCCGG GGCCGTCCGC GGGTCGGGGG 
30661 CCCCCGCCGC GTGCCGCGTG TGCACGGTGA 
30721 TCACCGTGAC GGAGAGCGCG AGCGCACCGG 
30781 TGAACGTGTC GAGGGCGCCG CAGCCGGCTT 
30841 GGGCCGCGGC GGGCAGCACC GCGAGGCCGT 
30901 CGACCCGGCC GGTGAGCACC AGGTCGCCGG 
30961 CCGGGTGCGC GACCGGCGTC TGTCCGGCCG 
31021 GCCAGTAGCG GACCCGCTCG AACGGGTACG 
31081 GGTCGATGAC CTTCGGCCAG TCGACCGTGA 
31141 TCAGGGCGGA TCGCGGTTCG TCGTCGGCGT 
31201 TCAGGCTCCG GTCCGGGCCG ATCTCCAGGA 
312 61 CCCCGAACCG GACGGTGTCG CGGACCTGTC 
31321 CGCCCGCGGC CATCGGGATC CTCGGCTCGT 
31381 ACTCCTCGAG CATCGGCTCC ATCCGCGCCG 
314 41 TGAAGCGGCC GAGCCGGGCC GCGACGTCGA 
31501 TCGACGCGGG CCCGTTGACC GCGGCGATCT 
31561 CCCGTTCCGA CGCGATCACG GCGGCCATCG 
31621 GGGCCCGTGC GGACACCAGC CTGCACGCGT 
31681 CGGCGGCCAG CTCGCCGATC GAATGGCCCA 

317 41 CGAGCTGTGC GCCGAGTGCG ACCTGGAGCG 
31801 CGTGGAGGTC GAGCCCGGCG GGCACGTCGA 

318 61 CGAAGACGTC GTAGGCGGCG GCCAGTCCGT 
31921 CGGAGAAGAG CCACACGAGG CGGCGGTCCG 
31981 CGATCAGCGC GGCCCGGTGC GGGAAGGCCG 
32041 GCTCGTCCTC CTCGCCGGTG GCGhGGTGGG 
32101 CCTGCGGGGT GCGTGCCGAG AGCAGCAGGG 
32161 GTTCGGGGGC CGGTCGGGGG TGGCTTTCGA 
32221 CGAAGGAGGA CACCCCGGCG CGCCGTGGGC 
32281 TGAGGAGTTC GACGGCGCCG GCCGTCCAGT 
32341 GGGTGCGCGG CAGGGTGCCG TGCCGCATGG 
32401 CCGCGGCGGC CTGAGTGTGG CCGATGTTGG 
324 61 CGCGATGCTG CCCGTAGGTG GCCAGTACCG 
32521 TCCCGGTGCC ATGCGCCTCG ACAGCGTCCA 
32581 GCGCCTGCCG GATCACCCGC TCCTGCGACG 
32641 CACCGTCCTG GTTGACCGCC GAACCACGCA 
327 01 CGGCGTCGGA GAGCCTCTCG ACGATCAGCA 
327 61 CAGCCGCATC CGCGAACGCC TTGCAGCGGC 
32821 AGTCCACGAA GCCGGACGGC GAGGCCATCA 
32881 ACTCCCCCGA GCGCAGCGAC TGCCCGGCCT 
32941 CCGTGTCCAC CGTGACCGCC GGACCCTCCA 
33001 GCACACTGGT CTGGGTGCTG GTGGCACCGA 
33061 CGTAGAAGTA GCCGCCCATG AACACGCCGG 
33121 TCCCGGCGTG TTCCAGCGCC TCCCACGAGG 
33181 TCGCCAGCGC CTCACGCGGA CTGATCCCGA 
33241 CGAGGAAGCC ACCATGACGC ACGGTCGACG 
33301 GCCCGTCCAC GTCCCAACCA CGGTCCGTCG 
33361 CCAGCAGCCG CCACAAGTCC TCCGGCGACG 
33421 CCACGATCGC CAACGGCTCG TCCTGCCGGA 
334 81 TGGCCCGCGC GCCGGCCAGT TCGTCCAGGT 
33541 CGAAGACGAG CGTAGCGGGC AGCGTCAGGC 
33601 CGACGCCGGT CAGCGAGTCG AAGCCCACTT 



CCGCCCAGAC GGCGTCGTCG TCGGCCCAGA 
CCGTGTACCC GGCTCGGGCC AGACGGTCGG 
TGGCGGGCGG CCACGTCGAC GGCATCTCCC 
CGAGGATTCC GTGCGCGTGC TCGGTCCACT 
CCGCGCGGCG GCCGTCCGCC CCGGGCGCGC 
ACCGCGGCAG CGTGAGGGGG GTGTCCACGG 
CGTCGCCCGC CCGGATCGCC AGATCCAGGA' 
GCAGGGAGTG CGCCAGCGGA TCGGCGGCGT 
TGCCGGGCAG GGTGACCGCC GCGGTCAGCG 
GGGC.CGCGTC GCCCGCGGTC TGGGTGCCGA 
TCGGCGGGTG CGAGGCGCGT GCCGGCGCGG 
CGCCGTCGGT GTGCAGCCGG GCGAGCGCGG 
GCAGCATCGG GATGCCGTCG ACGAGTCGGG 
GCACCGCCCC GTCGTGCGCG GCGACCTGTT 
GTACCCAGTA CTCCGGCGTG GTGCAGGCGG 
GGTACGTCAG GCTCTCCGCG ACCTTGCGGA 
AGTGGAACGC GTGGCTGGTC CGCAGGCGGG 
GCACCGCCTC CTCGTCACCG GAGAGCACGA 
CCACGCCGTC CCGCAGCAGC GGCAGCGCGT 
CCCCGCCGGA CGGCAGCGCC TGCATCAGGC 
CCTCCAGGGA CCAGACGCCG GCGACGTACG 
CGAAGGCGTC CGGGCGTACG CCCCACGCCT 
CGAACACCGC GGGCTGGGCG TACCCGGTGT 
GGGCGTCCAG CACCTCGCGG CGAGTGCGGG 
CGCCCATGCC GGGACGTTGT GAGCCCTGTC 
GTTCTGCGGC GCCGGTGACC GTGTCGGTGC 
TGCGGGCGAG CAGGGCCGCG GCCACCGCGC 
CGCGCAGGCG GTGTACCTGT GCGTCGAGTG 
GCAGCGGTCC GGTGTCGGGT GCCGGGGCGG 
GGATGATGTG AGCGTTGGTG CCGCTAACGC 
GGTCGGTTTC GGGCCAGGGG CGGGCGTCGG 
CGACGTGCGA GGACGGCGTG TCCACGTGCA 
CGAGGACCAT CTTGATGACA CCGGCGACGC 
ACTTCAGCGA GCCCAGCAGC ACCGGGGTGT 
CCTGCGCCTC GATGGGGTCG CCCAGCCTGG 
CATCCGCCGG GGTGAGCCCG GCGTTGGCCA 
GCCCGTTCGG CGCCGACAAC CCGTTGGAAG 
CGACCGCCAG GACATTGTGG CCGTGCCGCT 
CACCGGATCC CTCGGCGAAA CCGGTGCCAT 
CGTCCGGGGA GAGGCCCCGC TGCTGGGAGA 
CCGTGACGCC GCCGACCACG GCGAGCGAGC 
GGTGCAGCGC CACCAGCGAC GACGAACACG 
AACCGTAGAA GTACGACAGC CGACCGGACA 
AACCGCCGCG GTCGGCTCCA GTGCCGTACC 
TGTCGCTTCC GCGCAGCGAC TCCGGGAGGA 
TCTCCAGGAC CAGACGCTGC TGCGGGTCCA 
AGAACGCCGC GTCGAAGTCC GCCACCCCGG 
TGCCCGGATG ATCCGGATCG GGATCGTACA 
GAAACGCCGT GATCCCGTCA CCACCCGACT 
CGACCCCACC CGGCAGCCGG CAGGCCATCC 
CGGCCGCGGT CGGGGTACGC CGCCGGGTGG 
GGGCGGCGAG CGCCTGCGCC GTGGGGTGGT 
CCGTCGCGTC GGCCAGCCGG TTGCGCAGTT 
CCCTGAACGC GCGCGCGGGT GCGATGGCGT 
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33661 GGGCGTCGCG GTGGCCGAGC ACCGCGGCAG CGCTGGTACG GACGAGGTCG AGCATGTCGC 
33721 GCGCGGCCGG AGGTGCGGAC GTGCGCCGGA CGGCCGGCAC GAGGGTGCGT AGGACCGGCG 
33781 GGACCCGGTC GGACGCGGCG ACGGCGGCGA GGTCGAGCCG GATCGGCACG AGCGCGGGCC
33841 GGTCGGTGTG CAGGGCCGCG TCGAACAGGG CGAGCCCCTG TGCGGCCGTC ATCGGGGTCA 
33901 TGCCGTTGCG GGCGATGCGG GCCAGGTCGG TGGCGGTCAG CCGCCCGCCC ATCCCGTCCG 
33961 CCGCGTCCCA CAGTCCCCAG GCGAGCGAGA CGGCGGGCAG CCCCTGGTGG TGCCGGTGGC 
34021 GGGCGAGCGC GTCGAGGAAC GCGTTGCCGG TCGCGTAGTT GGCCTGACCC GCGCCGCCGA
34081 ACGTGGCGGA TATGGACGAG TACAGGACGA ACGCGGCCAG GTCGAGATCG CGCGTCAGCT
34141 CGTGCAGGTG CCAGGCGACG TCCGCCTTGA CCCGCAGCAC GGCGTCCCAC TGCTCCGGCC 
34201 GCATGGTCGT CACGGCCGCG TCGTCGACGA TCCCGGCCAT GTGCACGACG GCGCGCAGCC
34261 GCTGGGCGAC GTCGGCGACG ACTGCGGCCA GCTCGTCGCG GTCGACGACG TCGGCGGCCA
34321 CGTACCGCAC GCGGTCGTCC TCCGGCGTGT CGCCGGGCCG GCCGTTGCGG GACACCACGA
34381 CGACCTCGGC GGCCTCGTGC ACGGTGAGCA GGTGGTCCAC GAGGAGGCGG CCGAGCCCGC
34441 CGGTGCCGCC GGTGACGAGG ACGGTCCCGC CGGTCAGCGG GGAGGTTCCG GTGGCCGCGG
34501 CGACACGGCG CAGACGGGCC GCACGCGCTG TGCCGTCGGC GACCCGGACG TGCGGCTCGT
34561 CGCCGGCGGC GAGCCCGGCC GCTATGGCGG CGGGCGTGAT CTCGTCCGCT TCGATCAGGG
34621 CGACGCGGCC GGGATGCTCC GTCTCCGCCG TCCGGACCAG GCCGCCGAGC GCTTCCTGCG
34681 CGGGATCGCC GGTACGGGTG GCCACGATGA GCCGGGATCG CGCCCAGCGC GGCTCGGCGA
34741 GCCAGGTCTG CACGGTGGTG AGCAGGTCGC GGCCCAGCTC CCGGGTCCGG GCGCCGGGCG
34801 AGGTGCCCGG GTCGCCGGGT TCCACGGCCA GGACCACGAC CGGGGGGTGC TCGCCGTCGG
34861 GCACGTCGGC GAGGTACGTC CAGTCGGGGA CGGGTGACGC GGGCACGGGC ACCCAGGCGA
34921 TCTCGAACAG CGCCTCGGCA TCGGGGTCGG CGGCCCGCAC GGTCAGGCTG TCGACGTCAA
34981 GGACCGGTGA GCCGTGCTCG TCCGTGGCGA CGATGCGGAC CATGTCGGGG CCGACGCGTT
35041 CCAGCAGCAC GCGCAGCGCG GTCGCGGCGC GCGCGTGGAT CCTCACGCCG GACCAGGAGA 
35101 ACGCCAGCCG GCGCCGCTCC GGGTCCGTGA AGACCGTCCC GAGGGCGTGC AGGGCCGCGT 
35161 CGAGCAGCAC GGGGTGCAGC CCGTACCGGG CGTCGGTGAG CTGTTCGGCG AGGCGGACCG 
35221 ACGCGTAGGC GCGGCCCTCC CCCGTCCACA TCGCGGTCAT GGCCCGGAAC GCGGGCCCGT 
35281 ACGAGAGCGG CAGCGCGTCG TAGAAGCCGG TCAGGTCGGC CGGGTCGGCG TCGGCGGGCG 
35341 GGCAGTCCAC GGGCTCCGCC GGACCGCCAG TGTCCACGCT CAGCGCTCCG GTCGCACTGA 
35401 GCGCCCAGGG GCCCGTGCCG GTACGGCTGT GCAGACTCAC CGACCGCCGT CCGGACACCT
35461 CGGTTCCGAC GGTGGCCTGG ATCTCCGTGT CGCCGTCGCC GTCGACCACC ACCGGCGCGA
35521 CGATGGTCAG CTCCGCGATC TCCGGCGTGC CGAGCCGGGC TCCCGCTTCG GCGAGCAGTT 
35581 CCACGAGCGC CGAGCCGGGC ACGATGACCC GGCCGTCCAC CTCGTGGTCG GCGAGCCAGG 
35641 GCTGACGGCG TACCGAGACA CCGCGGTGGC CAGCGCGCCC TCGCCGTCGG GCGAGGTCGA 
35701 CCCACGAGCC GAGCAGCGGG TGGCCGGACG TTCCCGCCGG TTCCGCGTCG ATCCAGTAGC 
35761 GGTCACGGCG GAACGGGTAC GTGGGCAGCG GCACCACCCG ACGCGTCGCG AACGACCAGG 
35821 TGACGGGCAC GCCCCGGACC CAGAGCGCGG CGAGCGACCG AGTGAAGCGG TCCAGGCCGC 
35881 CCTCGCCTCG CCGCAGTGTG CCGGTGACGA CCGTATGCGC ATGCCCGGCG AGCGTGTCCT
35941 CCAGTGCGGT GGTGAGCACG GGATGCGCGC TGACCTCGAC GAACGCGCGG TATCCGCGGT 
36001 CCGCCAGGTG GCCGGTCGCG GCGGCGAACC GAACGGTGCG GCGCAGGTTG TCGTACCAGT 
36061 AGGCGGCGTC CGCGGGCCGG TCCAGCCACG CCTCGTCCAC GGTGGAGAAG AACGGGACGT 
36121 CCGGCGTGCG CGGAGTGATG CCGGCGAGAG CGTCGAGCAG CGCGCCGCGG ATCGTTTCGA 
36181 CATGCGCGGT GTGCGACGCG TAGTCGACGG CGATCCGGCG GGCGCGGGGG GTGGCGGCCA 
36241 GCAGCTCCTC CACGGCGTCG GCCGCACCGG CGACAACGAT CGACGCGGGT CCGTTGACCG 
36301 CGGCGACCTC CAGGCGCCCG GCCCACACGG CGGCGTCGAA GTCGGCGGGC GGCACCGAGA 
36361 CCATGCCGCC CTGCCCGGCC AGTTCGGTGG CGACGAGTCG GCTGCGCACC GCGACGACCT 
36421 TCGCGGCGTC GTCCAGGGTG AGCACCCCGG CGACGCAGGC CGCGGCGACT TCGCCCTGGG
36481 AGTGGCCGAC GACCGCGGCC GGGGCGACCC CGTGCGCACG CCACAGCTCC GCCAGCGCCA 
36541 CCATCACCGC GAACGACGCG GGCTGCACGA CATCGACCCG GTCGAACGCG GGCGCTCCGG 
36601 GCCGCTGGGC GATGACGTCC AGCAGGTCCC ATCCGGTGTG CGGGGCGAGC GCCGTGGCGC 
36661 ACTCGCGGAG CCGCCGGGCG AACACGGGCT CGGTGGCGAG CAGTTCGGCA CCCATGCCGG 
36721 CCCACTGGGA GCCCTGCCCG GGGAACGCGA ACACGACACG TGTGTCGGTG ACGTCGGCGG 
36781 TTCCCGTCAC GGCCCCCGGC ACTTCGGCAC CACGGGCGAA CGCCTCCGCC TCTCGGGCCG 
36841 GCACGACCGC CCGGTGGCGC ATGGCCGTCC GGGTGGTGGC GAGCGAGTGG CCGACCGCGG
36901 CCGCGGCGCC AGTGAGCGGG GCCAGCTGTC CCGCGACGTC CCGCAGTCCC TCCGGGGTCC
36961 GGGCCGACAT CGGCCAGACC ACGTCCTCGG GCACCGGCTC GGCTTCGGGT GCGGACACGG
37021 GTGCGGGCGC GGCGGGGGGC CCGGCCTCCA GGACGACATG GGCGTTGGTG CCGCTGATGC
37081 CGAACGACGA GACACCCGCA CGCCGGGCGC GCCCGGTGAC CGGCCACGGC TCACTGCGGT
37141 GCAGCAGCCG GATGTCGCCG TCCCAGTCGA CGTGCCGGGA CGGCTCGTCG ACGTGCAGCG
37201 TGCGCGGCAG GACGCCGTGC CGCATCGCCA TGACCATCTT GATGACGCCG GCGACGCCGG
37261 CCGCGGCCTG GGTGTGGCCG ATGTTCGACT TGAGCGAGCC GATCAGCAGC GGATGCACGC
37321 GTTCGCGCCC GTAGGCCACT TGCAGGGCCT GGGCCTCGAC GGGGTCGCCG AGACGGGTGC
37381 CGGTGCCGTG TGCCTCCACG GCGTCGACGT CACCCGGCGC CAGGCCGGCG TCGGCGAGCG
37441 CACGCTGGAT GACGCGCTGC TGCGCAGGCC CGTTCGGGGC GGACAGCCCG TTCGACGCGC
37501 CGTCGGAGTT GACCGCGGAG CCGCGCACCA GCGCCAGCAC GGGGTGGCCG TGGCGGGTGG
37561 CGTCGGAGAG CCGCTCCAGC ACCAGGACAC CGGCGCCCTC GGCGAAGCTC GTGCCGTCCG
37621 CGGTGTCCGC GAAGGCCTTG GCACGGCCGT CGGGGGCGAG CCCGCGCTGC CGGGAGAACT
37681 CGACGAACCC GGTCGTCGTC GCCATCACCG TGACACCGCC GACCAGGGCG AGCGAGCACT
37741 CCCCCGAGCG CAGCGACCGC GCGGCCTGGT GCAGCGCCAC CAGCGACGAC GAACACGCCG
37801 TGTCGACGGT GACCGACGGG CCCTCCAGAC CGAAGTAGTA CGAGAGCCGC CCGGAGAGAA
37861 CGCTGGTCGG CGTGCCGGTC GCCCCGAAAC CGCCCAGGTC CACGCCCGCG CCGTAGCCCT
37921 GGGTGAACGC GCCCATGAAT ACGCCGGTGT CGCTGCCGCG GACGCTTTCG GGCAGGATGC
37981 CCGCTCGTTC GAACGCCTCC CACGACGCTT CGAGGACCAG ACGCTGCTGC GGGTCCATCG
38041 CCAGCGCCTC ACGCGGGCTG ATCCCGAAGA ACGCGGCGTC GAAGTCGGCG GCGCCGGTGA
38101 GGAAGCCGCC GTGACGCACG GAAACCTTGC CGACCGCGTC GGGGTTCGGG TCGTAGAGCG
38161 CGGCGAGGTC CCAGCCGCGG TCGGCGGGGA ACTCGGTGAT CGCGTCCCCG CCGGAGTCGA
38221 CCAGCCGCCA CAGGTCCTCC GGTGACCGCA CGCCACCGGG CATCCGGCAC GCCATGGCCA
38281 CGATCGCCAG CGGCTCGTTC CCCGCCACCG TCGGTGCGGG CACTGTCGCC GCCGGAGCGG
38341 CAGGGGCCGG CTCACCCCGC CGTTCCTCAT CCAGGCGGGC GGCGAGCGCG GCCGGTGTCG
38401 GGTGGTCGAA GACGGCCGTC GCGGAGAGCC GTACCCCCGT CGTCTCGGCG AGGCTGTTGC
38461 GCAACCGGAC ACCGCTGAGC GAGTCGATGC CGAGGTCCTT GAACGCCGTC GTGGGCGTGA
38521 TCTCGGAGGC GTCGGCGTGG CCGAGCACGG CGGCCGTGGC CGCACACACG ATGGCCAGCA
38581 GGTCACGATC GCGGTCGCGG TCGCGGTCGC GGTTGTCCTC CGCACGGGCG GCGATGCGGC
38641 GCTCGGTCCG CTGCCGGACG GGCTCGGTGG GAATCGCCGC GACCATGAAC GGCACGTCCG
38701 CGGCGAGGCT CGCGTCGATG AAGTGGGTGC CCTCGGCCTC GGTGAGCGGC CGGAACCCGT
38761 CGCGCACCCG CTGCCGGTCG GCGTCGTCAA GTTGTCCGGT GAGGGTGCTG GTGGTGTGCC
38821 ACATGCCCCA GGCGATGGAG GTGGCGGGTT GGCCGAGGGT GTGGCGGTGG GTGGCGAGGG
38881 CGTCGAGGAA GGCGTTGGCG GCGGCGTAGT TTCCTTGTCC GGGGCTGCCG AGGACGGCGG
38941 CGGCGCTGGA GTAGAGGACG AAGTGGGTGA GGGGTTGGTT TTGGGTGAGG TGGTGCAGGT
39001 GCCAGGCGGC GTTGGCTTTG GGGTGGAGGA CGGTGGTGAG GCGGTCGGGG GTGAGGGCGT
39061 CGAGGATGCC GTCGTCGAGG GTGGCGGCGG TGTGGAAGAC GGCGGTGAGG GGTTGGGGGA
39121 TGTGGGCGAG GGTGGTGGCG AGTTGGTGGG GGTCGCCGAC GTCGCAGGGG AGGTGGGTGC
39181 CGGGGGTGGT GTCGGGGGGT GGGGTGCGGG AGAGGAGGTA GGTGTGGGGG TGGTTCAGGT
39241 GGCGGGCGAG GATGCCGGCG AGGGTGCCGG AGCCGCCGGT GATGATGATG GCGTGTTCGG
39301 GGTTGAGGGG GGTGGTGGTG GGTGGGGTGG TGGTGTGGAG GGGGGTGAGG TGGGGTCGGT
39361 GGAGGGTGTG GTGGGTGAGG CGGAGGTGGG GGTGGTCGAG GGTGGCGAGT TGGGCCAGGG
39421 GGAGGGGAGT GTGGGGGTGG TCGGTTTCGA TGAGGCGGAT GCGGTGGGGG TGTTCGTTCT
39481 GGGCGGTGCG GGTGAGGCCG GTGACGGTGG CGCCGGCGGG GTCGGTGGTG GTGTGGACGA
39541 TGAGGGTGTG GTCGGTGGTG GTGAGGTGGT GTTGCAGGGC GGTCAGGACG CGGGTGGCGC
39601 GGGTGTGGGC GCGGGTGGGT ATGTCCTCGG GGTCGTCGGG GTGGGCGGCG GTGATCAGGA
39661 CGTGTCCCTC GGGCAGGTCA CCGTCGTAGA CCGCCTCGGC GACCGCGAGC CACTCCAACC
39721 GGAGCGGGTT CGGCCCCGAC GGGGTGTCGG CCCGCTCCCT CAGCACCAGC GAGTCCACCG
39781 ACACGACAGG ACGGCCATCC GGGTCGGCCA CGCGCACGGC GACGCCGGCC TCCCCCCGGG
39841 TGAGGGCGAC GCGCACCGCG GCGGCCCCGG TGGCGTTCAG GCGCACGCCC GTCCAGGAGA
39901 ACGGCAGCTC GATCCCGCCG CCCGCGTCGA GGCGCCCGGC GTGCAGGGCC GCGTCGAGCA
39961 GTGCCGGATG CACACCGAAA CCGTCCGCCT CGGCGGCCTG CTCGTCGGGC AGCGCCACCT
40021 CGGCATACAC GGTGTCACCA TCACGCCAGG CAGCCCGCAA CCCCTGGAAC GCCGACCCGT
40081 ACTCATAACC GGCATCCCGC AGTTCGTCAT AGAACCCCGA GACGTCGACG GCCGCGGCCG
40141 TGGCCGGCGG CCACTGCGAG AACGGCTCAC CGGAAGCGTT GGAGGTATCC GGGGTGTCGG
40201 GGGTCAGGGT GCCGCTGGCG TGCCGGGTCC AGCTGCCCGT GCCCTCGGTA CGCGCGTGGA
40261 CGGTCACCGG CCGCCGTCCG GCCTCATCGG CCCCTTCCAC GGTCACCGAC ACATCCACCG
40321 CTGCGGTCAC CGGCACCACG AGCGGGGATT CGATGACCAG TTCATCCACC ACCCCGCAAC
40381 CGGTCTCGTC ACCGGCCCGG ATGACCAGCT CCACAAACGC CGTACCCGGC AGCAGAACCG
40441 TGCCCCGCAC CGCGTGATCA GCCAGCCAGG GATGCGTACG CAATGAGATC CGGCCGGTGA
40501 GAACAACACC ACCACCGTCG TCGGCGGGCA GTGCTGTGAC GGCGGCCAGC ATCGGATGCG
40561 CCGCCCCGGT CAGCCCGGCC GCGGACAGGT CGGTGGCACC GGCCGCCTCC AGCCAGTACC
40621 GCCTGTGCTC GAACGCGTAG GTGGGCAGAT CCAGCAGCCG CCCCGGCACC GGTTCGACCA
40681 CCGTGCCCCA GTCCACCCCC GCACCCAGAG TCCACGCCTG CGCCAACGCC CCCAGCCACC
40741 GCTCCCAGCC ACCGTCACCA GTCCGCAACG ACGCCACCGT GCGGGCCTGT TCCATCGCCG
40801 GCAGCAGCAC CGGATGGGCA CTGCACTCCA CGAACACCGA CCCGTCCAGC TCCGCCACCG
40861 CCGCATCCAG CGCGACAGGG CGACGCAGGT TCCGGTACCA GTACCCCTCA TCCACCGGCT
40921 CGGTCACCCA GGCGCTGTCC ACGGTCGACC ACCACGCCAC CGACCCGGTC CCGCCGGAAA
40981 TTCCCTTCAG TACCTCAGCG AGTTCGTCCT CGATGGCCTC CACGTGAGGC GTGTGGGAGG
41041 CGTAGTCGAC CGCGATACGA CGCACCCGCA CCCCATCAGC CTCATACCGC GCCACCACCT
41101 CCTCCACCGC CGACGGGTCC CCCGCCACCA CCGTCGAAGC CGGACCATTA CGCGCCGCGA
41161 TCCACACACC CTCGACCAGA CCCACCTCAC CGGCCGGCAA CGCCACCGAA GCCATCGCCC
41221 CCCGGCCGGC CAGCCGCGCC GCGATCACCC GACTGCGCAA CGCCACCACG CGGGCGGCGT
41281 CCTCCAGGCT GAGGGCTCCG GCCACACACG CCGCCGCGAT CTCCCCCTGC GAGTGTCCGA
41341 CCACAGCGTC CGGCACGACC CCATGCGCCT GCCACAGCGC GGCCAGGCTC ACCGCGACCG
41401 CCCAGCTGGC CGGCTGGACC ACCTCCACCC GCTCCGCCAC ATCCGACCGC GACAACATCT
41461 CCCGCACATC CCAGCCCGTG TGCGGCAACA ACGCCCGCGC ACACTCCTCC ATACGAGCCG
41521 CGAACACCGC GGAACGGTCC ATGAGTTCCA CGCCCATGCC CACCCACTGG GCACCCTGCC
41581 CGGGGAAGAC GAACACCGTA CGCGGCTGAT CCACCGCCAC ACCCATCACC CGGGCATCAC
41641 CCAGCAGCAC CGCACGGTGA CCGAAGACAG CACGCTCACG CACCAACCCC TGCGCGACCG
41701 CGGCCACATC CACCCCACCC CCGCGCAGAT ACCCCTCCAG CCGCTCCACC TGCCCCCGCA
41761 GACTCACCTC ACCACGAGCC GACACCGGCA ACGGCACCAA CCCATCACCA CCCGACTCCA
41821 CACGCGACGG CCCAGGAACA CCCTCCAGGA TCACGTGCGC GTTCGTACCG CTCACCCCGA
41881 ACGACGACAC ACCCGCATGC GGTGCCCGAT CCGACTCGGG CCACGGCCTC GCCTCGGTGA
41941 GCAGCTCCAC CGCACCGGCC GACCAGTCCA CATGCGACGA CGGCTCGTCC ACGTGCAGCG
42001 TCTTCGGCGC GATCCCATGC CGCATCGCCA TGACCATCTT GATGACACCG GCGACACCCG
42061 CAGCCGCCTG CGCATGACCG ATGTTCGACT TGACCGAACC GAGGTAGAGC GGCGTGTCGC
42121 GGTCCTGCCC GTAGGCCGCG AGGACGGCCT GCGCCTCGAT CGGGTCGCCC AGCCGCGTGC
42181 CGGTGCCGTG CGCCTCCACC ACGTCCACAT CGGCGGCGCG CAGTCCGGCG TTGACCAACG
42241 CCTGCCGGAT CACGCGCTGC TGGGCGACGC CGTTGGGGGC GGACAGTCCG TTGGAGGCAC
42301 CGTCCTGGTT CACCGCCGAG CCGCGGACGA CCGCGAGAAC GGTGTGCCCG TTGCGCTCGG
42361 CGTCGGAGAG CCGCTCCAGC ACGAGAACGC CGACGCCCTC GGCGAAGCCG GTCCCGTCCG
42421 CCGCGTCGGC GAACGCCTTG CACCGTCCGT CCGGGGAGAG TCCGCGCTGC CGGGAGAACT
42481 CCACGAGCTC TGCGGTGTTC GCCATGACGG TGACACCGCC GACCAGCGCC AGGGAGCACT
42541 CCCCGGCCCG CAGTGCCTGT GCCGCCTGGT GCAGGGCGAC CAGCGACGAC GAGCACGCCG
42601 TGTCGACCGT GACCGCCGGG CCCTGAAGTC CGTACACGTA CGAGAGGCGC CCGGACAGGA
42661 CGCTCGTCTG CGTCGCCGTG ACACCGAGCC CGCCCAGGTC CCGGCCGACG CCGTAGCCCT
42721 GGTTGAACGC GCCCATGAAC ACGCCGGTGT CGCTCTCCCG GAGCCTGTCC GGCACGATGC
42781 CGGCGTTCTC GAACGCCTCC CAGGAGGTCT CCAGGATCAG GCGCTGCTGG GGGTCCATCG
42841 CCAGCGCCTC GTTCGGACTG ATGCCGAAGA ACGCGGCGTC GAACCCGGCG CCGGCCAGGA
42901 ATCCGCCGTG GCGTGTCGTG GAGCGGCCGG CCGCGTCCGG GTCCGGGTCG TACAGCGCGT
42961 CGACGTCCCA GCCCCGGTCG GTGGGGAACT CGGTGATCGC CTCGGTACCG GCGGCGACGA
43021 GCCGCCACAG GTCCTCCGGC GAGGCGACCC CGCCGGGCAG TCGGCACGCC ATGCCGACGA
43081 TCGCGACGGG GTCGCCGGAG CCGAGGGTCT GGGCGGTCGC GGGTGCCGCT GTCGCGGAGC
43141 CGGCGAGGTG GGCGGCGAAC GCACGCGGAG TGGGGTGGTC GAACGCGGTT GACGCGGGCA
43201 CCCGCAGACC CGTCCGCGCG GCGACGGTGT TGGTGAACTC GACGGTGGTG AGCGAGTCGA
43261 GGCCGTTCTC GCGGAACGTG CGGTCCGGGG AGCAGTGTCC GGCGCCCGGC AGGCCCAGGA
43321 CGGTGGCGAC GCTGTCGCGG ACCAGGTCGA GCAGTACGTC CTCCCGGCCC GCACGGGCCG
43381 CGGCGAGGCG GTTCGCCCAC TCCTGTTCCG TGGCGTCGGG CTCGGCCGGT CCGGTCAGTG
43441 CGGTGAGGAT CGGCGGCGTG GCGCCCGCCA TCGTCGCGGC CCGCGCCCCG GCGGAACCGG
43501 TCCGGGCCAC GATGTACGAG CCGCCGCCCG CGATGGCCTT CTCGATCAGG TCGCCGGTGA
43561 GCGCCGGCCG TTCGATGCCG GGCAGCGCGC GGACGGTGAC GGTGGGGAGT CGCTCCGCGG
43621 CCCGTGGCCG GGTGTGGGCG TCGGCGCCGG CCGGGCCGTC GAGCAGGACG TGCACGAGCG
43681 CGCCGGGGTT CGCGGCTTCC TCGGCTGCGG TGGTCACGTG GGTGAGGCCG GTCTCGTCGC
43741 GGAGCAGGCC GGCGACGGTG TCGGCGTCCT CCCCGGTGAC CAGGACCGGC GCGTCCGGGC
43801 CGATCGGAGG CGGCACGGTG AGGACCATCT TGCCGGTGTG CCGGGCGTGG CTCATCCACG
43861 CGAACGCGTC CCGCGCACGG CGGATGTCCC ACGGCTGCAC CGGCAGCGGG CACAGCTCAC
43921 CGCGGTCGAA CAGGTCGAGG AGCAGTTCGA GGATCTCCCG CAGGCGCGCG GGATCCACGT
43981 CGGCCAGGTC GAACGGCTGC TGGGCGGCGT GGCGGATGTC GGTCTTGCCC ATCTCGACGA
44041 ACCGGCCGCC CGGTGCGAGC AGGCCGATGG ACGCGTCGAG GAGTTCACCG GTGAGCGAGT
44101 TGAGCACGAC GTCGACCGGC GGGAAGGTGT CGGCGAACGC GGCGCTGCGG GAGTTCGCCA
44161 CATGGTCGGT GTCGAAGCCG TCGGCGTGCA GCAGGTGTTG TTTGGCGGGA CTGGCGGTGG
44221 CGTACACCTC GGCGCCGAGG TGGCGGGCGA TCCGGGTCGC CGCCATGCCG ACACCGCCCG
44281 TCGCGGCGTG GACCAGGACC TTCTGGCCGG GTCGCAGCTC GCCCGCGTCG ACGAGGCCGT
44341 ACCAGGCGGT GGCGAACACG ATGGGCACGG ACGCGGCGAT GGGGAACGAC CATCCCCGTG
44401 GGATCCGTGC GACCAGCCGC CGGTCCGCGA CCACGCTGCG CCGGAACGCG TCCTGCACGA
44461 GACCGAACAC GCGGTCGCCG GGGGCCAGGT CGTCGACGCC GGGTCCGACT TCGGTCACGA
44521 TGCCCGCGGC CTCCCCGCCC ATCTCGCCCT CGCCCGGGTA GGTGCCGAGC GCGATCAGCA
44581 CGTCGCGGAA GTTCAGCCCC GCGGCGCGGA CGTCGATGCG GACCTCGCCG GCGGCCAGGG
44641 GCGCGGCGGG ACGTCGAGCG GGGCGACGAC GAGGTCGCGG AGCGTTCCGG AGGCGGGCGG
44701 GCGCAGCGCC CACTGGCGCG GTCGGCAGGG GGGTGGTGTC CGCGCGTACC AGCCGGGGCA
44761 CGTAGGCCAC GCCGGCCCGC AGCGCGATCT GGGGTTCGCC GAGCGAGGCC GCGGCGGGGA
44821 CGAGGTCGTC ATCGCCGTCC GTGTCCACCA GCACGAACGA TCCGGGTTCG GCGGCCTGGC
44881 GGCGCAGCGC CTCGTCCCAG AGCCGGGCCT GGTCCGCGTC CGGGATCTCG GCCGGGCCGA
44941 CGCCCACCGC GCGGCGGGTG ACGACCGTCC GGCGGGGTGA CGGGGTGCCG GGCAGGTCGC
45001 GCCGCTCCCA GACCAGTTCG CACAGCGTGG CCTCGCCACT GCCGGTGGCG ACCAGATGGG
45061 CCGGCAGCCC CGCGAGCCGC GCGCGCTGGA CCTTGCCCGA CGCGGTGCGG GGGATCGTGG
45121 TGACGTGCCA GATCTCGTCG GGCACCTTGA AGTAGGCGAG CCGGCGGCGG CACTCGGCGA
45181 GGATCGCCTC GGCGGGGACG CGGGGGCCGT CGGAAACGAC GTAGAGCACG GGTATGTCGC
45241 CGAGGACGGG GTGCGGGCGG CCCGCCGCGG CGGCGTCCCG GACACCGGCC ACCTCCTGGG
45301 CGACGGTCTC GATCTCCCGG GGGTGGATGT TCTCCCCGCC GCGGATGATC AGCTCCTTGA
45361 CCCGGCCGGT GATCGTCACG TGTCCGGTCT CGGCCTGACG TGCGAGGTCC CCGGTGCGGT
45421 ACCAGCCGTC CACGAGCACC TGGGCGGTCG CCTCCGGCTG GGCGTGGTAG CCGAGCATGA
45481 GGCTCGGCCC GCTCGCCCAC AGCTCGCCCT CCTCGCCGGG TGCCACGTCG GCGCCGGACA
45541 CCGGGTCGAC GAACCGCAGC GACAGGCCCG GCACGGGCAG CCCGCACGAG CCGGGAACCC
45601 GCGCATCCTC CAGGGTGTTG GCGGTGAGCG AGCCGGTCGT CTCGGTGCAG CCGTACGTGT
45661 CGAGCAGGGG CACGCCGAAC GTCGCCTCGA AATCCCTGGT GAGCGACGCC GGCGAGGTGG
45721 ATCCGGCGAC CAGCGCCACG CGCAGCGCGC GAGCCCGCGG CTCGCCGGAC ACGGCGCCGA
45781 GGAGGTAGCG GTACATCGTC GGCACGCCGA CGAGCACGGT GCTGGAGTGT TCGGCCAGGG
45841 CGTCGAGGAC GTCACGCGCG ACGAAGCCGC CCAGGATACG GGCGGACGCG CCGACCGTGA
45901 GGACGGCGAG CAGGCAGAGG TGGTGGCCGA GGCTGTGGAA CAGCGGGGCG GGCCAGAGCA
45961 GTTCGTCGTC CTCGGTCAGC CGCCAGGACG GCACGTCGCA GTGCATCGCG GACCACAGGC
46021 CGCTGCGCTG TGCGGAAACC ACGCCCTTGG GACGGCCGGT GGTGCCGGAG GTGTAGAGCA
46081 TCCAGGCGGG TTCGTCCAGG CCGAGGTCGT CGCGGGGCGG GCACGGCGGC TCGGTCCCGG
46141 CGAGGTCCTC GTAGGAGACG CAGTCCGGTG CCCGGCGCCC GACGAGCACG ACGGTGGCGT
46201 CGGTGCCGGT GCGGCGCACC TGGTCGAGGT GGGTTTCGTC GGTGACCAGC ACGGTCGCGC
46261 CGGAGTCCGT CAGGAAGTGG GCGAGTTCGG CGTCGGCGGC GTCCGGGTTG AGCGGGACGG
46321 CGACGGCGGC GGCGCGGGCG GCGGCGAGGT AGACCTCGAT GGTCTCGATC CGGTTGCCGA
46381 GCAGCATCGC GACCCGGTCG CCGCGGTCGA CGCCGGACGC GGCGAGGTGT CCGGCGAGCC
46441 GGCCGGCCCG GAGCCGGAGT TGCGTGTACG TCACGGCGCG TTGGGAATCC GTGTAGGCGA
46501 TCCGGTCGCC GCGTCGCTCG GCATGGATGC GGAGCAATTC GTGCAACGGC CGGATTGGTT
46561 CCACACGCGC CATGGAAACA CCTTTCTCTC GACCAACCGC ACAACAGCAC GGAACCGGCC
46621 ACGAGTAGAC GCCGGCGACG CTAGCAGCGT TTTCCGGACC GCCACCCCCT GAAGATCCCC
46681 CTACCGTGGC CGGCCTCCCC GGACGCTCAT CTAGGGGGTT GCACGCATAC CGCCGTGCGT
46741 AATTGCCTTC CTGATGACCG ATGCCGGACG CCAGGGAAGG GTGGAGGCGT TGTCCATATC
46801 TGTCACGGCG CCGTATTGCC GCTTCGAGAA GACCGGATCA CCGGACCTCG AGGGTGACGA
46861 GACGGTGCTC GGCCTGATCG AGCACGGCAC CGGCCACACC GACGTGTCGC TGGTGGACGG
46921 TGCTCCCCGG ACCGCCGTGC ACACCACGAC CCGTGACGAC GAGGCGTTCA CCGAGGTCTG
46981 GCACGCACAG CGCCCTGTCG AGTCCGGCAT GGACAACGGC ATCGCCTGGG CCCGCACCGA
47041 CGCGTACCTG TTCGGTGTCG TGCGCACCGG CGAGAGCGGC AGGTACGCCG ATGCCACCGC
47101 GGCCCTCTAC ACGAACGTCT TCCAGCTCAC CCGGTCGCTG GGGTATCCCC TGCTCGCCCG
47161 GACCTGGAAC TACGTCAGCG GTATCAACAC GACGAACGCG GACGGGCTGG AGGTGTACCG
47221 GGACTTCTGC GTGGGCCGCG CCCAGGCGCT CGACGAGGGC GGGATCGACC CGGCCACCAT
47281 GCCCGCGGCC ACCGGTATCG GCGCCCACGG GGGCGGCATC ACCTGCGTGT TCCTCGCCGC
47341 CCGGGGCGGA GTGCGGATCA ACATCGAGAA CCCCGCCGTC CTCACGGCCC ACCACTACCC
47401 GACGACGTAC GGTCCGCGGC CCCCGGTCTT CGCACGGGCC ACCTGGCTGG GCCCGCCGGA
47461 GGGGGGCCGG CTGTTCATCT CCGCGACGGC CGGCATCCTC GGACACCGAA CGGTGCACCA
47521 CGGTGATGTG ACCGGCCAGT GCGAGGTCGC CCTCGACAAC ATGGCCCGGG TCATCGGCGC
47581 GGAGAACCTG CGGCGCCACG GCGTCCAGCG GGGGCACGTC CTCGCCGACG TGGACCACCT
47641 CAAGGTCTAC GTCCGCCGCC GCGAGGATCT CGATACGGTC CGCCGGGTCT GCGCCGCACG
47701 CCTGTCGAGC ACCGCGGCCG TCGCCCTTTT GCACACCGAC ATAGCCCGCG AGGATCTGCT
47761 CGTCGAAATC GAAGGCATGG TGGCGTGACA ATACCCGGTA AAAGGCCCGC GACGCTGCGC
47821 CTCGGCGGAT CCGCGAAGAG AAAGAAGAGC GTCACCGCAC AGCGCGGCAG CCCGGTCCTT
47881 TCGTCCTTCG CACAGCGGCG GATCTGGTTT CTCCAGCAAT TGGACCCGGA GAGCAACGCC
47941 TATAATCTCC CGCTCGTGCA ACGCCTGCGC GGTCTATTGG ACGCGCCGGC CCTGGAGCGT
48001 GCGCTGGCGC TCGTCGTCGC GCGCCACGAG GCGTTGCGGA CGGTGTTCGA CACCGCCGAC
48061 GGCGAGCCCC TCCAGCGGGT GCTTCCCGCC CCGGAACACC TCCTGCGCCA CGCGCGGGCG
48121 GGCAGCGAGG AGGACGCCGC CCGGCTCGTC CGCGACGAGA TCGCCGCGCC GTTCGACCTC
48181 GCCACGGGGC CGTTGATCAG GGCCCTGCTG ATCCGCCTCG GTGACGACGA CCACGTTCTC
48241 GCGGTGACCG TGCACCATGT CGCCGGCGAC GGCTGGTCGT TCGGGCTCCT CCAACATGAA
48301 CTCGCAGCCC ACTACACGGC GCTGCGCGAC ACTGCCCGCC CTGCCGAACT GCCGCCGTTG
48361 CCGGTGCAGT ACGCCGACTT CGCCGCCTGG GAGCGGCGCG AACTCACCGG CGCCGGACTG
48421 GACAGGCGTC TGGCCTACTG GCGCGAGCAA CTCCGGGGCG CCCCGGCGCG GCTCGCCCTC
48481 CCCACCGACC GTCCCCGCCC GCCGGTCGCC GACGCGGACG CGGGCATGGC CGAGTGGCGG
48541 CCGCCGGCCG CGCTGGCCAC CGCGGTCCTC ACGCTCGCGC GCGACTCCGG TGCGTCCGTG
48601 TTCATGACCC TGCTGGCGGC CTTCCAAGCG GTCCTCGCCC GGCAGGCGGG CACGCGGGAC
48661 GTGCTGGTCG GCACGCCCGT GGCGAACCGT ACGCGGGCGG CGTACGAGGG CCTGATCGGC
48721 ATGTTCGTCA ACACGCTCGC GCTGCGCGGC GACCTCTCGG GCGATCCGTC GTTCCGGGAA
48781 CTCCTCGACC GCTGCCGGGC CACGACCACG GACGCGTTCG CCCACGCCGA CCTGCCGTTC
48841 GAGAACGTCA TCGAACTCGT CGCACCGGAA CGCGACCTGT CGGTCAACCC GGTCGTCCAG
48901 GTGCTGTTGC AGGTGCTGCG GCGCGACGCG GCGACGGCCG CGCTGCCCGG CATCGCGGCC
48961 GAACCGTTCC GCACCGGACG CTGGTTCACC CGCTTCGACC TCGAATTCCA TGTGTACGAG
49021 GAGCCGGGTG GCGCGCTGAC CGGCGAACTG CTCTACAGCC GTGCGCTGTT CGACGAGCCA
49081 CGGATCACGG GGTTGCTGGA GGAGTTCACG GCGGTGCTTC AGGCGGTCAC CGCCGACCCG
49141 GACGTACGGC TGTCGCGGCT GCCGGCCGGC GACGCGACGG CGGCAGCGCC CGTGGTGCCC
49201 TCGAACGACA CGGCGCGGGA CCTGCCCGTC GACACGCTGC CGGGCCTGCT GGCCCGGTAC
49261 GCCGCACGCA CCCCCGGCGC CGTGGCCGTC ACCGACCCGC ACATCTCCCT CACCTACGCG
49321 CAGCTGGACC GGCGGGCGAA CCGCCTCGCG CACCTGCTCC GCGCGCGCGG CACCGCCACC
49381 GGCGACCTGG TCGGGATCTG CGCCGATCGC GGCGCCGACC TGATCGTCGG CATCGTGGGG
49441 ATCCTCAAGG CGGGCGCCGC TTATGTGCCG CTGGACCCCG AACATCCTCC GGAGCGCACG
49501 GCGTTCGTGC TGGCCGACGC GCAGCTGACC ACGGTGGTGG CGCACGAGGT CTACCGTTCC
49561 CGGTTCCCCG ATGTGCCGCA CGTGGTGGCG TTGGACGACC CGGAGCTGGA CCGGCAGCCG
49621 GACGACACGG CGCCGGACGT CGAGCTGGAC CGGGACAGCC TCGCCTACGC GATCTACACG
49681 TCCGGGTCGA CCGGCAGGCC GAAGGCCGTG CTCATGCCGG GTGTCAGCGC CGTCAACCTG
49741 CTGCTCTGGC AGGAGCGCAC GATGGGCCGC GAGCCGGCCA GCCGCACCGT CCAGTTCGTG
49801 ACGCCCACGT TCGACTACTC GGTGCAGGAG ATCTTTTCCG CGCTGCTGGG CGGCACGCTC
49861 GTCATCCCGC CGGACGAGGT GCGGTTCGAC CCGCCGGGAC TCGCCCGGTG GATGGACGAA
49921 CAGGCGATTA CCCGGATCTA CGCGCCGACG GCCGTACTGC GCGCGCTGAT CGAGCACGTC
49981 GATCCGCACA GCGACCAGCT CGCCGCCCTG CGGCACCTGT GCCAGGGCGG CGAGGCGCTG
50041 ATCCTCGACG CGCGGTTGCG CGAGCTGTGC CGGCACCGGC CCCACCTGCG CGTGCACAAT 
50101 CACTACGGTC CGGCCGAAAG CCAGCTCATC ACCGGGTACA CGCTGCCCGC CGACCCCGAC 
50161 GCGTGGCCCG CCACCGCACC GATCGGCCCG CCGATCGACA ACACCCGCAT CCATCTGCTC 
50221 GACGAGGCGA TGCGGCCGGT TCCGGACGGT ATGCCGGGGC AGCTCTGCGT CGCCGGCGTC 
50281 GGCCTCGCCC GTGGGTACCT GGCCCGTCCC GAGCTGACCG CCGAGCGCTG GGTGCCGGGA 
50341 GATGCGGTCG GCGAGGAGCG CATGTACCTC ACCGGCGACC TGGCCCGCCG CGCGCCCGAC 
50401 GGCGACCTGG AATTCCTCGG CCGGATCGAC GACCAGGTCA AGATCCGCGG CATCCGCGTC
50461 GAACCGGGTG AGATCGAGAG CCTGCTCGCC GAGGACGCCC GCGTCACGCA GGCGGCGGTG
50521 TCCGTGCGCG AGGACCGGCG GGGCGAGAAG TTCCTGGCCG CGTACGTCGT ACCGGTGGCC 
50581 GGCCGGCACG GCGACGACTT CGCCGCGTCG CTGCGCGCGG GACTGGCCGC CCGGCTGCCC 
50641 GCCGCGCTCG TGCCCTCCGC CGTCGTCCTG GTGGAGCGAC TGCCGAGGAC CACGAGCGGC 
50701 AAGGTGGACC GGCGCGCGCT GCCCGACCCG GAGCCGGGCC CGGCGTCGAC CGGGGCGGTT 
50761 ACGCCCCGCA CCGATGCCGA GCGGACGGTG TGCCGGATCT TCCAGGAGGT GCTCGACGTC 
50821 CCGCGGGTCG GTGCCGACGA CGACTTCTTC ACGCTCGGCG GGCACTCCCT GCTCGCCACC 
50881 CGGGTCGTCT CCCGCATCCG CGCCGAGCTG GGTGCCGATG TCCCGCTGCG TACGCTCTTC 
50941 GACGGGCGGA CGCCCGCCGC GCTCGCCCGT GCGGCGGACG AGGCCGGCCC GGCCGCCCTG 
51001 CCCCCGATCG CGCCCTCCGC GGAGAACGGG CCGGCCCCCC TCACCGCGGC ACAGGAACAG 
51061 ATGCTGCACT CGCACGGCTC GCTGCTCGCC GCGCCCTCCT ACACGGTCGC CCCGTACGGG 
51121 TTCCGGCTGC GCGGGCCACT CGACCGCGAA GCGCTCGACG CGGCACTGAC CCGGATCGCC 
51181 GCGCGCCACG AGCCGCTGCG GACCGGGTTC CGCGATCGGG AACAGGTCGT CCGGCCGCCC 
51241 GCTCCGGTGC GCGCCGAGGT GGTTCCGGTG CCGGTCGGCG ACGTCGACGC CGCGGTCCGG 
51301 GTCGCCCACC GGGAGCTGAC CCGGCCGTTG GACCTCGTGA ACGGGTCGTT GCTGCGTGCC 
51361 GTGCTGCTGC CGCTGGGCGC CGAGGATCAC GTGCTGCTGC TGATGCTGCA CCACCTCGCC 
51421 GGTGACGGAT GGTCCTTCGA CCTCCTGGTC CGGGAGTTGT CGGGGACGCA ACCGGACCTT 
51481 CCGGTGTCCT ACACGGACGT GGCCCGGTGG GAACGGAGTC CGGCCGTGAT CGCGGCCAGG
51541 GAGAACGACC GGGCCTACTG GCGCCGGCGG CTGGGGGGCG CCACCGCGCC GGAGCTGCCC 
51601 GCGGTCCGGC CCGGCGGGGC ACCGACCGGG CGGGCGTTCC TGTGGACGCT CAAGGACACC 
51661 GCCGTCCTGG CGGCACGCCG GGTCGCGGAC GCCCACGACG CGACGTTGCA CGAAACCGTG 
51721 CTCGGCGCCT TCGCCCTGGT CGTGGCGGAG ACCGCCGACA CCGACGACGT GCTCGTCGCG 
51781 ACGCCGTTCG CGGACCGGGG GTACGCCGGG ACCGACCACC TCATCGGCTT CTTCGCGAAG 
51841 GTCCTCGCGC TGCGCCTCGA CCTCGGCGGC ACGCCGTCGT TCCCCGAGGT GCTGCGCCGG 
51901 GTGCACACCG CGATGGTGGG CGCGCACGCC CACCAGGCGG TGCCCTACTC CGCGCTGCGC 
51961 GCCGAGGACC CCGCGCTGCC GCCGGCCCCC GTGTCGTTCC AGCTCATCAG CGCGCTCAGC 
52021 GCGGAACTGC GGCTGCCCGG CATGCACACC GAGCCGTTCC CCGTCGTCGC CGAGACCGTC 
52081 GACGAGATGA CCGGCGAACT GTCGATCAAC CTCTTCGACG ACGGTCGCAC CGTCTCCGGC 
52141 GCGGTGGTCC ACGATGCCGC GCTGCTCGAC CGTGCCACCG TCGACGATTT GCTCACCCGG 
52201 GTGGAGGCGA CGCTGCGTGC CGCCGCGGGC GACCTCACCG TACGCGTCAC CGGTTACGTG 
52261 GAAAGCGAGT AGCCATGCCC GAGCAGGACA AGACAGTCGA GTACCTTCGC TGGGCGACCG
52321 CGGAACTCCA GAAGACCCGT GCGGAACTCG CCGCGCACAG CGAGCCGTTG GCGATCGTGG 
52381 GGATGGCCTG CCGGCTGCCC GGCGGGGTCG CGTCGCCGGA GGACCTGTGG CAGTTGCTGG 
52441 AGTCCGGTGG CGACGGCATC ACCGCGTTCC CCACGGACCG GGGCTGGGAG ACCACCGCCG
52501 ACGGTCGCGG CGGCTTCCTC ACCGGGGCGG CCGGCTTCGA CGCGGCGTTC TTCGGCATCA 
52561 GCCCGCGCGA GGCGCTGGCG ATGGACCCGC AGCAGCGCCT GGCCCTGGAG ACCTCGTGGG 
52621 AGGCGTTCGA GCACGCGGGC ATCGATCCGC AGACGCTGCG GGGCAGTGAC ACGGGGGTGT 
52681 TCCTCGGCGC GTTCTTCCAG GGGTACGGCA TCGGCGCCGA CTTCGACGGT TACGGCACCA
52741 CGAGCATTCA CACGAGCGTG CTCTCCGGCC GCCTCGCGTA CTTCTACGGT CTGGAGGGTC
52801 CGGCGGTCAC GGTCGACACG GCGTGTTCGT CGTCGCTGGT GGCGCTGCAC CAGGCCGGGC 
52861 AGTCGCTGCG CTCCGGCGAA TGCTCGCTCG CCCTGGTCGG CGGCGTCACG GTGATGGCCT 
52921 CGCCGGCGGG GTTCGCGGAC TTCTCCGAGC AGGGCGGCCT GGCCCCCGAC GCGCGCTGCA
52981 AGGCCTTCGC GGAAGCGGCT GACGGCACCG GTTTCGCCGA GGGGTCCGGC GTCCTGATCG 
53041 TCGAGAAGCT CTCCGACGCC GAGCGCAACG GCCACCGCGT GCTGGCGGTC GTCCGGGGTT 
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53101 CCGCCGTCAA CCAGGACGGT GCCTCCAACG GGCTGTCCGC GCCGAACGGG CCGTCGCAGG 

53161 AGCGGGTGAT CCGGCAGGCC CTGGCCAACG CCGGACTCAC CCCGGCGGAC GTGGACGCCG 

53221 TCGAGGCCCA CGGCACCGGC ACCAGGCTGG GCGACCCCAT CGAGGCACAG GCCGTGCTGG 

53281 CCACCTACGG GCAGGGGCGC GACACCCCTG TGCTGCTGGG CTCGCTGAAG TCCAACATCG 

5 53341 GCCACACCCA GGCCGCCGCG GGCGTCGCCG GTGTCATCAA GATGGTCCTC GCCATGCGGC 

534 01 ACGGCACCCT GCCCCGCACC CTGCACGTGG ACACGCCGTC CTCGCACGTC GACTGGACGG 

534 61 CCGGCGCCGT CGAACTCCTC ACCGACGCCC GGCCCTGGCC CGAAACCGAC CGCCCACGGC 

53521 GCGCCGGTGT CTCCTCCTTC GGCGTCAGCG GCACCAACGC CCACATCATC CTCGAAAGCC 

53581 ACCCCCGACC GGCCCCCGAA CCCGCCCCGG CACCCGACAC CGGACCGCTG CCGCTGCTGC 

10 53641 TCTCGGCCCG CACCCCGCAG GCACTCGACG CACAGGTACA CCGCCTGCGC GCGTTCCTCG 

537 01 ACGACAACCC CGGCGCGGAC CGGGTCGCCG TCGCGCAGAC ACTCGCCCGG CGCACCCAGT 

537 61 TCGAGCACCG CGCCGTGCTG CTCGGCGACA CGCTCATCAC CGTGAGCCCG AACGCCGGCC 

53821 GCGGACCGGT GGTCTTCGTC TACTCGGGGC AAAGCACGCT GCACCCGCAC ACCGGGCGGC 

53881 AACTCGCGTC CACCTACCCC GTGTTCGCCG AAGCGTGGCG CGAGGCCCTC GACCACCTCG 

15 53941 ACCCCACCCA GGGCCCGGCC ACGCACTTCG CCCACCAGAC CGCGCTCACC GCGCTCCTGC 

54 001 GGTCCTGGGG CATCACCCCG CACGCGGTCA TCGGCCACTC CCTCGGTGAG ATCACCGCCG 

54 061 CGCACGCCGC CGGTGTCCTG TCCCTGAGGG ACGCGGGCGC GCTCCTCACC ACCCGCACCC 

54121 GCCTGATGGA CCAACTGCCG TCGGGCGGCG CGATGGTCAC CGTCCTGACC AGCGAGGAAA 

54181 AGGCACGCCA GGTGCTGCGG CCGGGCGTGG AGATCGCCGC CGTCAACGGC CCCCACTCCC 

G 20 54 241 TCGTGCTGTC CGGGGACGAG GAAGCCGTAC TCGAAGCCGC CCGGCAGCTC GGCATCCACC 

ul 54 301 ACCGCCTGCC GACCCGCCAC GCCGGCCACT CCGAGCGCAT GCAGCCACTC GTCGCCCCCC 

IM 54 361 TCCTCGACGT CGCCCGGACC CTGACGTACC ACCAGCCCCA CACCGCCATC CCCGGCGACC 

fi| 544 21 CCACCACCGC CGAATACTGG GCGCACCAGG TCCGCGACCA AGTACGTTTC CAGGCGCACA 

y| 54 4 81 CCGAGCAGTA CCCGGGCGCG ACGTTCCTCG AGATCGGCCC CAACCAGGAC CTCTCGCCGC 

25 54 541 TCGTCGACGG CGTTGCCGCC CAGACCGGTA CGCCCGACGA GGTGCGGGCG CTGCACACCG 

III 54 601 CGCTCGCGCA GCTCCACGTC CGCGGCGTCG CGATCGACTG GACGCTCGTC CTCGGCGGGG 

54 661 ACCGCGCGCC CGTCACGCTG CCCACGTATC CGTTCCAGCA CAAGGACTAC TGGCTGCGGC 

54 721 CCACCTCCCG GGCCGATGTG ACCGGCGCGG GGCAGGAGCA GGTGGCGCAC CCGCTGCTCG 
" w 54781 GCGCCGCGGT ' CGCGCTGCCC GGCACGGGCG GAGTCGTCCT GACCGGCCGC CTGTCGCTGG 
D 30 54 841 CCTCCCATCC GTGGCTCGGC GAGCACGCGG TCGACGGCAC CGTGCTCCTG CCCGGCGCGG 
«p 54 901 CCTTCCTCGA ACTCGCGGCG CGCGCCGGCG ACGAGGTCGG CTGCGACCTG CTGCACGAAC 
O 54 961 TCGTCATCGA GACGCCGCTC GTGCTGCCCG CGACCGGCGG TGTGGCGGTC TCCGTCGAGA 
[j 55021 TCGCCGAACC CGACGACACG GGGCGGCGGG CGGTCACCGT CCACGCGCGG GCCGACGGCT 

55081 CGGGCCTGTG GACCCGACAC GCCGGCGGAT TCCTCGGCAC GGCACCGGCA CCGGCCACGG 

35 55141 CCACGGACCC GGCACCCTGG CCGCCCGCGG AAGCCGGACC GGTCGACGTC GCCGACGTCT 

55201 ACGACCGGTT CGAGGACATC GGGTACTCCT ACGGACCGGG CTTCCGGGGG CTGCGGGCCG 

552 61 CCTGGCGCGC CGGCGACACC GTGTACGCCG AGGTCGCGCT CCCCGACGAG CAGAGCGCCG 

55321 ACGCCGCCCG TTTCACGCTG CACCCCGCGC TGCTCGACGC CGCGTTCCAG GCCGGCGCGC 

55381 TGGCCGCGCT CGACGCACCC GGCGGGGCGG CCCGACTGCC GTTCTCGTTC CAGGACGTCC 

40 554 41 GCATCCACGC GGCCGGGGCG ACGCGGCTGC GGGTCACGGT CGGCCGCGAC GGCGAGCGCA 

55501 GCACCGTCCG CATGACCGGC CCGGACGGGC AGCTGGTGGC CGTGGTCGGT GCCGTGCTGT 

55561 CGCGCCCGTA CGCGGAAGGC TCCGGTGACG GCCTGCTGCG CCCGGTCTGG ACCGAGCTGC 

55621 CGATGCCCGT CCCGTCCGCG GACGATCCGC GCGTGGAGGT CCTCGGCGCC GACCCGGGCG 

55681 ACGGCGACGT TCCGGCGGCC ACCCGGGAGC TGACCGCCCG CGTCCTCGGC GCGCTCCAGC 

45 557 41 GCCACCTGTC CGCCGCCGAG GACACCACCT TGGTGGTACG GACCGGCACC GGCCCGGCCG 

558 01 CTGCCGCCGC CGCGGGTCTG GTCCGCTCGG CGCAGGCGGA GAACCCCGGC CGCGTCGTGC 

558 61 TCGTCGAGGC GTCCCCGGAC ACCTCGGTGG AGCTGCTCGC CGCGTGCGCC GCGCTGGACG 

55921 AACCGCAGCT GGCCGTCCGG GACGGCGTGC TCTTCGCGCC GCGGCTGGTC CGGATGTCCG 

55 981 ACCCCGCGCA CGGCCCGCTG TCCCTGCCGG ACGGCGACTG GCTGCTCACC CGGTCCGCCT 
50 56041 CCGGCACGTT GCACGACGTC GCGCTCATAG CCGACGACAC GCCCCGGCGG GCGCTCGAAG 

56101 CCGGCGAGGT CCGCATCGAC GTCCGCGCGG CCGGACTGAA CTTCCGCGAT GTGCTGATCG 

56161 CGCTCGGGAC GTACACCGGG GCCACGGCCA TGGGCGGCGA GGCCGCGGGC GTCGTGGTGG 

56221 AGACCGGGCC CGGCGTGGAC GACCTGTCCC CCGGCGACCG GGTGTTCGGC CTGACCCGGG 

56281 GCGGCATCGG CCCGACGGCC GTCACCGACC GGCGCTGGCT GGCCCGGATC GCCGACGGCT 
56341 GGAGCTTCAC CACGGCGGCG TCCGTCCCGA TCGTGTTCGC GACCGCGTGG TACGGCCTGG
56401 TCGACCTCGG CACACTGCGC GCCGGCGAGA AGGTCCTCGT CCACGCGGCC ACCGGCGGTG
56461 TCGGCATGGC CGCCGCACAG ATCGCCCGCC ACCTGGGCGC CGAGCTCTAC GCCACCGCCA
56521 GTACCGGCAA GCAGCACGTC CTGCGCGCCG CCGGGCTGCC CGACACGCAC ATCGCCGACT 
56581 CTCGGACGAC CGCGTTCCGG ACCGCTTTCC CGCGCATGGA CGTCGTCCTG AACGCGCTGA 
56641 CCGGCGAGTT CATCGACGCG TCGCTCGACC TGCTGGACGC CGACGGCCGG TTCGTCGAGA 
56701 TGGGCCGCAC CGAGCTGCGC GACCCGGCCG CGATCGTCCC CGCCTACCTG CCGTTCGACC 
56761 TGCTGGACGC GGGCGCCGAC CGCATCGGCG AGATCCTGGG CGAACTGCTC CGGCTGTTCG 
56821 ACGCGGGCGC GCTGGAGCCG CTGCCGGTCC GTGCCTGGGA CGTCCGGCAG GCACGCGACG 
56881 CGCTCGGCTG GATGAGCCGC GCCCGCCACA TCGGCAAGAA CGTCCTGACG CTGCCCCGGC 
56941 CGCTCGACCC GGAGGGCGCC GTCGTCCTCA CCGGCGGCTC CGGCACGCTC GCCGGCATCC 
57001 TCGCCCGCCA CCTGCGCGAA CGGCATGTCT ACCTGCTGTC CCGGACGGCA CCGCCCGAGG 
57061 GGACGCCCGG CGTCCACCTG CCCTGCGACG TCGGTGACCG GGACCAGCTG GCGGCGGCCC
57121 TGGAGCGGGT GGACCGGCCG ATCACCGCCG TGGTGCACCT CGCCGGTGCG CTGGACGACG 
57181 GCACCGTCGC GTCGCTCACC CCCGAGCGTT TCGACACGGT GCTGCGCCCG AAGGCCGACG 
57241 GCGCCTGGTA CCTGCACGAG CTGACGAAGG AGCAGGACCT CGCCGCGTTC GTGCTCTACT 
57301 CGTCGGCCGC CGGCGTGCTC GGCAACGCCG GCCAGGGCAA CTACGTCGCC GCGAACGCGT 
57361 TCCTCGACGC GCTCGCCGAG CTGCGCCACG GTTCCGGGCT GCCGGCCCTC TCCATCGCCT 
57421 GGGGGCTCTG GGAGGACGTG AGCGGGCTCA CCGCGGCGCT CGGCGAAGCC GACCGGGACC
57481 GGATGCGGCG CAGCGGTTTC CGGGCCATCA CCGCGCAACA GGGCATGCAC CTGTACGAGG
57541 CGGCCGGCCG CACCGGAAGT CCCGTGGTGG TCGCGGCGGC GCTCGACGAC GCGCCGGACG
57601 TGCCGCTGCT GCGCGGCCTG CGGCGGACGA CCGTCCGGCG GGCCGCCGTC CGGGAGTGTT
57661 CGTCCGCCGA CCGGCTCGCC GCGCTGACCG GCGACGAGCT CGCCGAAGCG CTGCTGACGC
57721 TCGTCCGGGA GAGCACCGCC GCCGTGCTCG GCCACGTGGG TGGCGAGGAC ATCCCCGCGA
57781 CGGCGGCGTT CAAGGACCTC GGCATCGACT CGCTCACCGC GGTCCAGCTG CGCAACGCCC
57841 TCACCGAGGC GACCGGTGTG CGGCTGAACG CCACGGCGGT CTTCGACTTC CCGACCCCGC
57901 ACGTGCTCGC CGGGAAGCTC GGCGACGAAC TGACCGGCAC CCGCGCGCCC GTCGTGCCCC
57961 GGACCGCGGC CACGGCCGGT GCGCACGACG AGCCGCTGGC GATCGTGGGA ATGGCCTGCC
58021 GGCTGCCCGG CGGGGTCGCG TCACCCGAGG AGCTGTGGCA CCTCGTGGCA TCCGGCACCG 
58081 ACGCCATCAC GGAGTTCCCG ACGGACCGCG GCTGGGACGT CGACGCGATC TACGACCCGG 
58141 ACCCCGACGC GATCGGCAAG ACCTTCGTCC GGCACGGTGG CTTCCTCACC GGCGCGACAG 
58201 GCTTCGACGC GGCGTTCTTC GGCATCAGCC CGCGCGAGGC CCTCGCGATG GACCCGCAGC 
58261 AGCGGGTGCT CCTGGAGACG TCGTGGGAGG CGTTCGAAAG CGCCGGCATC ACCCCGGACT 
58321 CGACCCGCGG CAGCGACACC GGCGTGTTCG TCGGCGCCTT CTCCTACGGT TACGGCACCG 
58381 GTGCGGACAC CGACGGCTTC GGCGCGACCG GCTCGCAGAC CAGTGTGCTC TCCGGCCGGC 
58441 TGTCGTACTT CTACGGTCTG GAGGGTCCGG CGGTCACGGT CGACACGGCG TGTTCGTCGT
58501 CGCTGGTGGC GCTGCACCAG GCCGGGCAGT CGCTGCGCTC CGGCGAATGC TCGCTCGCCC 
58561 TGGTCGGCGG CGTCACGGTG ATGGCGTCTC CCGGCGGCTT CGTGGAGTTC TCCCGGCAGC 

58621 GCGGCCTCGC GCCGGACGGC CGGGCGAAGG CGTTCGGCGC GGGTGCGGAC GGCACGAGCT
58681 TCGCCGAGGG TGCCGGTGTG CTGATCGTCG AGAGGCTCTC CGACGCCGAA CGCAACGGTC
58741 ACACCGTCCT GGCGGTCGTC CGTGGTTCGG CGGTCAACCA GGATGGTGCC TCCAACGGGC 
58801 TGTCGGCGCC GAACGGGCCG TCGCAGGAGC GGGTGATCCG GCAGGCCCTG GCCAACGCCG 
58861 GGCTCACCCC GGCGGACGTG GACGCCGTCG AGGCCCACGG CACCGGCACC AGGCTGGGCG 
58921 ACCCCATCGA GGCACAGGCG GTACTGGCCA CCTACGGACA GGAGCGCGCC ACCCCCCTGC 
58981 TGCTGGGCTC GCTGAAGTCC AACATCGGCC ACGCCCAGGC CGCGTCCGGC GTCGCCGGCA
59041 TCATCAAGAT GGTGCAGGCC CTCCGGCACG GGGAGCTGCC GCCGACGCTG CACGCCGACG 
59101 AGCCGTCGCC GCACGTCGAC TGGACGGCCG GCGCCGTCGA ACTGCTGACG TCGGCCCGGC 
59161 CGTGGCCCGA GACCGACCGG CCACGGCGTG CCGCCGTCTC CTCGTTCGGG GTGAGCGGCA 
59221 CCAACGCCCA CGTCATCCTG GAGGCCGGAC CGGTAACGGA GACGCCCGCG GCATCGCCTT 
59281 CCGGTGACCT TCCCCTGCTG GTGTCGGCAC GCTCACCGGA AGCGCTCGAC GAGCAGATCC 
59341 GCCGACTGCG CGCCTACCTG GACACCACCC CGGACGTCGA CCGGGTGGCC GTGGCACAGA 
59401 CGCTGGCCCG GCGCACACAC TTCGCCCACC GCGCCGTGCT GCTCGGTGAC ACCGTCATCA 
59461 CCACACCCCC CGCGGACCGG CCCGACGAAC TCGTCTTCGT CTACTCCGGC CAGGGCACCC 
59521 AGCATCCCGC GATGGGCGAG CAGCTCGCCG CCGCCCATCC CGTGTTCGCC GACGCCTGGC 
59581 ATGAAGCGCT CCGCCGCCTT GACAACCCCG ACCCCCACGA CCCCACGCAC AGCCAGCATG
59641 TGCTCTTCGC CCACCAGGCG GCGTTCACCG CCCTCCTGCG GTCCTGGGGC ATCACCCCGC 
59701 ACGCGGTCAT CGGCCACTCG CTGGGCGAGA TCACCGCGGC GCACGCCGCC GGCATCCTGT 
59761 CGCTGGACGA CGCGTGCACC CTGATCACCA CGCGCGCCCG CCTCATGCAC ACGCTCCCGC
59821 CACCCGGTGC CATGGTCACC GTACTGACCA GCGAAGAGAA GGCACGCCAG GCGTTGCGGC 
59881 CGGGCGTGGA GATCGCCGCC GTCAACGGGC CCCACTCCAT CGTGCTGTCC GGGGACGAGG 
59941 ACGCCGTGCT CACCGTCGCC GGGCAGCTCG GCATCCACCA CCGCCTGCCC GCCCCGCACG 
60001 CCGGGCACTC CGCGCACATG GAGCCCGTGG CCGCCGAGCT GCTCGCCACC ACCCGCGGGC 
60061 TCCGCTACCA CCCTCCCCAC ACCTCCATTC CGAACGACCC CACCACCGCT GAGTACTGGG 
60121 CCGAGCAGGT CCGCAAGCCC GTGCTGTTCC ACGCCCACGC GCAGCAGTAC CCGGACGCCG 
60181 TGTTCGTGGA GATCGGCCCC GCCCAGGACC TCTCCCCGCT CGTCGACGGG ATCCCGCTGC 
60241 AGAACGGCAC CGCGGACGAG GTGCACGCGC TGCACACCGC GCTCGCGCAC CTCTACGCGC
60301 GCGGTGCCAC GCTCGACTGG CCCCGCATCC TCGGGGCTGG GTCACGGCAC GACGCGGATG 
60361 TGCCCGCGTA CGCGTTCCAA CGGCGGCACT ACTGGATCGA GTCGGCACGC CCGGCCGCAT 
60421 CCGACGCGGG CCACCCCGTG CTGGGCTCCG GTATCGCCCT CGCCGGGTCG CCGGGCCGGG
60481 TGTTCACGGG TTCCGTGCCG ACCGGTGCGG ACCGCGCGGT GTTCGTCGCC GAGCTGGCGC
60541 TGGCCGCCGC GGACGCGGTC GACTGCGCCA CGGTCGAGCG GCTCGACATC GCCTCCGTGC 
60601 CCGGCCGGCC GGGCCATGGC CGGACGACCG TACAGACCTG GGTCGACGAG CCGGCGGACG 
60661 ACGGCCGGCG CCGGTTCACC GTGCACACCC GCACCGGCGA CGCCCCGTGG ACGCTGCACG 
60721 CCGAGGGGGT GCTGCGCCCC CATGGCACGG CCCTGCCCGA TGCGGCCGAC GCCGAGTGGC
60781 CCCCACCGGG CGCGGTGCCC GCGGACGGGC TGCCGGGTGT GTGGCGCCGG GGGGACCAGG
60841 TCTTCGCCGA GGCCGAGGTG GACGGACCGG ACGGTTTCGT GGTGCACCCC GACCTGCTCG 
60901 ACGCGGTCTT CTCCGCGGTC GGCGACGGAA GCCGCCAGCC GGCCGGATGG CGCGACCTGA 
60961 CGGTGCACGC GTCGGACGCC ACCGTACTGC GCGCCTGCCT CACCCGGCGC ACCGACGGAG 
61021 CCATGGGATT CGCCGCCTTC GACGGCGCCG GCCTGCCGGT ACTCACCGCG GAGGCGGTGA 
61081 CGCTGCGGGA GGTGGCGTCA CCGTCCGGCT CCGAGGAGTC GGACGGCCTG CACCGGTTGG 
61141 AGTGGCTCGC GGTCGCCGAG GCGGTCTACG ACGGTGACCT GCCCGAGGGA CATGTCCTGA 
61201 TCACCGCCGC CCACCCCGAC GACCCCGAGG ACATACCCAC CCGCGCCCAC ACCCGCGCCA 
61261 CCCGCGTCCT GACCGCCCTG CAACACCACC TCACCACCAC CGACCACACC CTCATCGTCC 
61321 ACACCACCAC CGACCCCGCC GGCGCCACCG TCACCGGCCT CACCCGCACC GCCCAGAACG 
61381 AACACCCCCA CCGCATCCGC CTCATCGAAA CCGACCACCC CCACACCCCC CTCCCCCTGG 
61441 CCCAACTCGC CACCCTCGAC CACCCCCACC TCCGCCTCAC CCACCACACC CTCCACCACC
61501 CCCACCTCAC CCCCCTCCAC ACCACCACCC CACCCACCAC CACCCCCCTC AACCCCGAAC 
61561 ACGCCATCAT CATCACCGGC GGCTCCGGCA CCCTCGCCGG CATCCTCGCC CGCCACCTGA 
61621 ACCACCCCCA CACCTACCTC CTCTCCCGCA CCCCACCCCC CGACGCCACC CCCGGCACCC 
61681 ACCTCCCCTG CGACGTCGGC GACCCCCACC AACTCGCCAC CACCCTCACC CACATCCCCC 

61741 AACCCCTCAC CGCCATCTTC CACACCGCCG CCACCCTCGA CGACGGCATC CTCCACGCCC
61801 TCACCCCCGA CCGCCTCACC ACCGTCCTCC ACCCCAAAGC CAACGCCGCC TGGCACCTGC
61861 ACCACCTCAC CCAAAACCAA CCCCTCACCC ACTTCGTCCT CTACTCCAGC GCCGCCGCCG
61921 TCCTCGGCAG CCCCGGACAA GGAAACTACG CCGCCGCCAA CGCCTTCCTC GACGCCCTCG 
61981 CCACCCACCG CCACACCCTC GGCCAACCCG CCACCTCCAT CGCCTGGGGC ATGTGGCACA 
62041 CCACCAGCAC CCTCACCGGA CAACTCGACG ACGCCGACCG GGACCGCATC CGCCGCGGCG 
62101 GTTTCCTCCC GATCACGGAC GACGAGGGCA TGCGCCTCTA CGAGGCGGCC GTCGGCTCCG 
62161 GCGAGGACTT CGTCATGGCC GCCGCGATGG ACCCGGCACA GCCGATGACC GGCTCCGTAC 
62221 CGCCCATCCT GAGCGGCCTG CGCAGGAGCG CGCGGCGCGT CGCCCGTGCC GGGCAGACGT 
62281 TCGCCCAGCG GCTCGCCGAG CTGCCCGACG CCGACCGCGG CGCGGCGCTG ACCACCCTCG 
62341 TCTCGGACGC CACGGCCGCC GTGCTCGGCC ACGCCGACGC CTCCGAGATC GCGCCGACCA 
62401 CGACGTTCAA GGACCTCGGC ATCGACTCGC TCACCGCGAT CGAGCTGCGC AACCGGCTCG 
62461 CGGAGGCGAC CGGGCTGCGG CTGAGTGCCA CGCTGGTGTT CGACCACCCG ACACCTCGGG
62521 TCCTCGCCGC CAAGCTCCGC ACCGATCTGT TCGGCACGGC CGTGCCCACG CCCGCGCGGA 
62581 CGGCACGGAC CCACCACGAC GAGCCACTCG CGATCGTCGG CATGGCGTGC CGACTGCCCG 
62641 GCGGGGTCGC CTCGCCGGAG GACCTGTGGC AGCTCGTGGC GTCCGGCACC GACGCGATCA
62701 CCGAGTTCCC CACCGACCGC GGCTGGGACA TCGACCGGCT GTTCGACCCG GACCCGGACG
62761 CCCCCGGCAA GACCTACGTC CGGCACGGCG GCTTCCTCGC CGAGGCCGCC GGCTTCGATG
62821 CCGCGTTCTT CGGCATCAGC CCGCGCGAGG CACGGGCCAT GGACCCGCAG CAGCGCGTCA
62881 TCCTCGAAAC CTCCTGGGAG GCGTTCGAGA ACGCGGGCAT CGTGCCGGAC ACGCTGCGCG 
62941 GCAGCGACAC CGGCGTGTTC ATGGGCGCGT TCTCCCATGG GTACGGCGCC GGCGTCGACC 
63001 TGGGCGGGTT CGGCGCCACC GCCACGCAGA ACAGCGTGCT CTCCGGCCGG TTGTCGTACT 
63061 TCTTCGGCAT GGAGGGCCCG GCCGTCACCG TCGACACCGC CTGCTCGTCG TCGCTGGTCG 
63121 CCCTGCACCA GGCGGCACAG GCGCTGCGGA CTGGAGAATG CTCGCTGGCG CTCGCCGGCG 
63181 GTGTCACGGT GATGCCCACC CCGCTGGGCT ACGTCGAGTT CTGCCGCCAG CGGGGACTCG 
63241 CCCCCGACGG CCGTTGCCAG GCCTTCGCGG AAGGCGCCGA CGGCACGAGC TTCTCGGAGG 
63301 GCGCCGGCGT TCTTGTGCTG GAGCGGCTCT CCGACGCCGA GCGCAACGGA CACACCGTCC 
63361 TCGCGGTCGT CCGCTCCTCC GCCGTCAACC AGGACGGCGC CTCCAACGGC ATCTCCGCAC 
63421 CCAACGGCCC CTCCCAGCAG CGCGTCATCC GCCAGGCCCT CGACAAGGCC GGGCTCGCCC
63481 CCGCCGACGT GGACGTGGTG GAGGCCCACG GCACCGGAAC CCCGCTGGGC GACCCGATCG
63541 AGGCACAGGC CATCATCGCG ACCTACGGCC AGGACCGCGA CACACCGCTC TACCTCGGTT 
63601 CGGTCAAGTC GAACATCGGA CACACCCAGA CCACCGCCGG TGTCGCCGGC GTCATCAAGA 
63661 TGGTCATGGC GATGCGCCAC GGCATCGCGC CGAAGACACT GCACGTGGAC GAGCCGTCGT 
63721 CGCATGTGGA CTGGACCGAG GGTGCGGTGG AACTGCTCAC CGAGGCGAGG CCGTGGCCCG 
63781 ACGCGGGACG CCCGCGCCGC GCGGGCGTGT CGTCGCTCGG TATCAGCGGT ACGAACGCCC 
63841 ACGTGATCCT TGAGGGTGTT CCCGGGCCGT CGCGTGTGGA GCCGTCTGTT GACGGGTTGG
63901 TGCCGTTGCC GGTGTCGGCT CGGAGTGAGG CGAGTCTGCG GGGGCAGGTG GAGCGGCTGG 
63961 AGGGGTATCT GCGCGGGAGT GTGGATGTGG CCGCGGTCGC GCAGGGGTTG GTGCGTGAGC 
64021 GTGCTGTCTT CGGTCACCGT GCGGTACTGC TGGGTGATGC CCGGGTGATG GGTGTGGCGG
64081 TGGATCAGCC GCGTACGGTG TTCGTCTTTC CCGGGCAGGG TGCTCAGTGG GTGGGCATGG
64141 GTGTGGAGTT GATGGACCGT TCTGCGGTGT TCGCGGCTCG TATGGAGGAG TGTGCGCGGG
64201 CGTTGTTGCC GCACACGGGC TGGGATGTGC GGGAGATGTT GGCGCGGCCG GATGTGGCGG
64261 AGCGGGTGGA GGTGGTCCAG CCGGCCAGCT GGGCGGTCGC GGTCAGCCTG GCCGCACTGT
64321 GGCAGGCCCA CGGGGTCGTA CCCGACGCGG TGATCGGACA CTCCCAGGGC GAGATCGCGG
64381 CGGCGTGCGT GGCCGGGGCC CTCAGCCTTG AGGACGCCGC CCGCGTGGTG GCCTTGCGCA
64441 GCCAGGTCAT CGCGGCGCGA CTGGCCGGGC GGGGAGCGAT GGCTTCGGTG GCATTGCCGG 
64501 CCGGTGAGGT CGGTCTGGTC GAGGGCGTGT GGATCGCGGC GCGTAACGGC CCCGCCTCGA 
64561 CAGTCGTGGC CGGCGAGCCG TCGGCGGTGG AGGACGTGGT GACGCGGTAT GAGACCGAAG 
64621 GCGTGCGAGT GCGTCGTATC GCCGTCGACT ACGCCTCCCA CACGCCCCAC GTGGAAGCCA
64681 TCGAGGACGA ACTCGCTGAG GTACTGAAGG GAGTTGCAGG GAAGGCCGCG TCGGTGGCGT
64741 GGTGGTCGAC CGTGGACAGC GCCTGGGTGA CCGAGCCGGT GGATGAGAGT TACTGGTACC
64801 GGAACCTGCG TCGCCCCGTC GCGCTGGACG CGGCGGTGGC GGAGCTGGAC GGGTCCGTGT
64861 TCGTGGAGTG CAGCGCCCAT CCGGTGCTGC TGCCGGCGAT GGAACAGGCC CACACGGTGG
64921 CGTCGTTGCG CACCGGTGAC GGCGGCTGGG AGCGATGGCT GACGGCGTTG GCGCAGGCGT
64981 GGACCCTGGG CGCGGCAGTG GACTGGGACA CGGTGGTCGA ACCGGTGCCA GGGCGGCTGC
65041 TCGATCTGCC CACCTACGCG TTCGAGCGCC GGCGCTACTG GCTGGAAGCG GCCGGTGCCA 
65101 CCGACCTGTC CGCGGCCGGG CTGACAGGGG CAGCACATCC CATGCTGGCC GCCATCACGG 
65161 CACTACCCGC CGACGACGGT GGTGTTGTTC TCACCGGCCG GATCTCGTTG CGCACGCATC 
65221 CCTGGCTGGC TGATCACGCG GTGCGGGGCA CGGTCCTGCT GCCGGGCACG GCCTTTGTGG 
65281 AGCTGGTCAT CCGGGCCGGT GACGAGACCG GTTGCGGGAT AGTGGATGAA CTGGTCATCG 
65341 AATCCCCCCT CGTGGTGCCG GCGACCGCAG CCGTGGATCT GTCGGTGACC GTGGAAGGAG 
65401 CTGACGAGGC CGGACGGCGG CGAGTGACCG TCCACGCCCG CACCGAAGGC ACCGGCAGCT
65461 GGACCCGGCA CGCCAGCGGC ACCCTGACCC CCGACACCCC CGACACCCCC AACGCTTCCG
65521 GTGTTGTCGG TGCGGAGCCG TTCTCGCAGT GGCCACCTGC CACTGCCGCG GCCGTCGACA 
65581 CCTCGGAGTT CTACTTGCGC CTGGACGCGC TGGGCTACCG GTTCGGACCC ATGTTCCGCG 
65641 GAATGCGGGC TGCCTGGCGT GATGGTGACA CCGTGTACGC CGAGGTCGCG CTCCCCGAGG 
65701 ACCGTGCCGC CGACGCGGAC GGTTTCGGCA TGCACCCGGC GCTGCTCGAC GCGGCCTTGC 
65761 AGAGCGGCAG CCTGCTCATG CTGGAATCGG ACGGCGAGCA GAGCGTGCAA CTGCCGTTCT
65821 CCTGGCACGG CGTCCGGTTC CACGCGACGG GCGCGACCAT GCTGCGGGTG GCGGTCGTAC 
65881 CGGGCCCGGA CGGCCTCCGG CTGCATGCCG CGGACAGCGG GAACCGTCCC GTCGCGACGA 
65941 TCGACGCGCT CGTGACCCGG TCCCCGGAAG CGGACCTCGC GCCCGCCGAT CCGATGCTGC 
66001 GGGTCGGGTG GGCCCCGGTG CCCGTACCTG CCGGGGCCGG TCCGTCCGAC GCGGACGTGC
66061 TGACGCTGCG CGGCGACGAC GCCGACCCGC TCGGGGAGAC CCGGGACCTG ACCACCCGTG
66121 TTCTCGACGC GCTGCTCCGG GCCGACCGGC CGGTGATCTT CCAGGTGACC GGTGGCCTCG
66181 CCGCCAAGGC GGCCGCAGGC CTGGTCCGCA CCGCTCAGAA CGAGCAGCCC GGCCGCTTCT
66241 TCCTCGTCGA AACGGACCCG GGAGAGGTCC TGGACGGCGC GAAGCGCGAC GCGATCGCGG
66301 CACTCGGCGA GCCCCATGTG CGGCTGCGCG ACGGCCTCTT CGAGGCAGCC CGGCTGATGC
66361 GGGCCACGCC GTCCCTGACG CTCCCGGACA CCGGGTCGTG GCAGCTGCGG CCGTCCGCCA
66421 CCGGTTCCCT CGACGACCTT GCCGTCGTCC CCACCGACGC CCCGGACCGG CCGCTCGCGG
66481 CCGGCGAGGT GCGGATCGCG GTACGCGCGG CGGGCCTGAA CTTCCGGGAT GTCACGGTCG
66541 CGCTCGGTGT GGTCGCCGAT GCGCGTCCGC TCGGCAGCGA GGCCGCGGGT GTCGTCCTGG
66601 AGACCGGCCC CGGTGTGCAC GACCTGGCGC CCGGCGACCG GGTCCTGGGG ATGCTCGCGG
66661 GCGCCTTCGG ACCGGTCGCG ATCACCGACC GGCGGCTGCT CGGCCGGATG CCGGACGGCT
66721 GGACGTTCCC GCAGGCGGCG TCCGTGATGA CCGCGTTCGC GACCGCGTGG TACGGCCTGG
66781 TCGACCTGGC CGGGCTGCGC CCCGGCGAGA AGGTCCTGAT CCACGCGGCG GCGACCGGTG
66841 TCGGCGCGGC GGCCGTCCAG ATCGCGCGGC ATCTGGGCGC GGAGGTGTAC GCGACCACCA
66901 GCGCCGCGAA GCGCCATCTG GTGGACCTGG ACGGAGCGCA TCTGGCCGAT TCCCGCAGCA
66961 CCGCGTTCGC CGACGCGTTC CCGCCGGTCG ATGTCGTGCT CAACTCGCTC ACCGGTGAAT
67021 TCCTCGACGC GTCCGTCGGC CTGCTCGCGG CGGGTGGCCG GTTCATCGAG ATGGGGAAGA
67081 CGGACATCCG GCACGCCGTC CAGCAGCCGT TCGACCTGAT GGACGCCGGC CCCGACCGGA
67141 TGCAGCGGAT CATCGTCGAG CTGCTCGGCC TGTTCGCGCG CGACGTGCTG CACCCGCTGC
67201 CGGTCCACGC CTGGGACGTG CGGCAGGCGC GGGAGGCGTT CGGCTGGATG AGCAGCGGGC
67261 GTCACACCGG CAAGCTGGTG CTGACGGTCC CGCGGCCGCT GGATCCCGAG GGGGCCGTCG
67321 TCATCACCGG CGGCTCCGGC ACCCTCGCCG GCATCCTCGC CCGCCACCTG GGCCACCCCC
67381 ACACCTACCT GCTCTCCCGC ACCCCACCCC CCGACACCAC CCCCGGCACC CACCTCCCCT
67441 GCGACGTCGG CGACCCCCAC CAACTCGCCA CCACCCTCGC CCGCATCCCC CAACCCCTCA
67501 CCGCCGTCTT CCACACCGCC GGAACCCTCG ACGACGCCCT GCTCGACAAC CTCACCCCCG
67561 ACCGCGTCGA CACCGTCCTC AAACCCAAGG CCGACGCCGC CTGGCACCTG CACCGGCTCA
67621 CCCGCGACAC CGACCTCGCC GCGTTCGTCG TCTACTCCGC GGTCGCCGGC CTCATGGGCA
67681 GCCCGGGGCA GGGCAACTAC GTCGCGGCGA ACGCGTTCCT CGACGCGCTC GCCGAACACC
67741 GCCGTGCGCA AGGGCTGCCC GCGCAGTCCC TCGCATGGGG CATGTGGGCG GACGTCAGCG
67801 CGCTCACCGC GAAACTCACC GACGCGGACC GCCAGCGCAT CCGGCGCAGC GGATTCCCGC
67861 CGTTGAGCGC CGCGGACGGC ATGCGGCTGT TCGACGCGGC GACGCGTACC CCGGAACCGG
67921 TCGTCGTCGC GACGACCGTC GACCTCACCC AGCTCGACGG CGCCGTCGCG CCGTTGCTCC
67981 GCGGTCTGGC CGCGCACCGG GCCGGGCCGG CGCGCACGGT CGCCCGCAAC GCCGGCGAAG
68041 AGCCCCTGGC CGTGCGTCTT GCCGGGCGTA CCGCCGCCGA GCAGCGGCGC ATCATGCAGG
68101 AGGTCGTGCT CCGCCACGCG GCCGCGGTCC TCGCGTACGG GCTGGGCGAC CGCGTGGCGG
68161 CGGACCGTCC GTTCCGCGAG CTCGGTTTCG ATTCGCTGAC CGCGGTCGAC CTGCGCAATC
68221 GGCTCGCGGC CGAGACGGGG CTGCGGCTGC CGACGACGCT GGTGTTCAGC CACCCGACGG
68281 CGGAGGCGCT CACCGCCCAC CTGCTCGACC TGATCGACGC TCCCACCGCC CGGATCGCCG
68341 GGGAGTCCCT GCCCGCGGTG ACGGCCGCTC CCGTGGCGGC CGCGCGGGAC CAGGACGAGC
68401 CGATCGCCAT CGTGGCGATG GCGTGCCGGC TGCCCGGTGG TGTGACGTCG CCCGAGGACC
68461 TGTGGCGGCT CGTCGAGTCC GGCACCGACG CGATCACCAC GCCTCCTGAC GACCGCGGCT
68521 GGGACGTCGA CGCGCTGTAC GACGCGGACC CGGACGCGGC CGGCAAGGCG TACAACCTGC
68581 GGGGCGGTTA CCTGGCCGGG GCGGCGGAGT TCGACGCGGC GTTCTTCGAC ATCAGTCCGC
68641 GCGAAGCGCT CGGCATGGAC CCGCAGCAAC GCCTGCTGCT CGAAACGGCG TGGGAGGCGA
68701 TCGAGCGCGG CCGGATCAGT CCGGCGTCGC TCCGCGGCCG GGAGGTCGGC GTCTATGTCG
68761 GTGCGGCCGC GCAGGGCTAC GGGCTGGGCG CCGAGGACAC CGAGGGCCAC GCGATCACCG
68821 GTGGTTCCAC GAGCCTGCTG TCCGGACGGC TGGCGTACGT GCTCGGGCTG GAGGGCCCGG
68881 CGGTCACCGT GGACACGGCG TGCTCGTCGT CTCTGGTCGC GCTGCATCTG GCGTGCCAGG
68941 GGCTGCGCCT GGGCGAGTGC GAACTCGCTC TGGCCGGAGG GGTCTCCGTA CTGAGTTCGC
69001 CGGCCGCGTT CGTGGAGTTC TCCCGCCAGC GCGGGCTCGC GGCCGACGGG CGCTGCAAGT
69061 CGTTCGGCGC GGGCGCGGAC GGCACGACGT GGTCCGAGGG CGTGGGCGTG CTCGTACTGG
69121 AACGGCTCTC CGACGCCGAG CGGCTCGGGC ACACCGTGCT CGCCGTCGTC CGCGGCAGCG
69181 CCGTCACGTC CGACGGCGCC TCCAACGGCC TCACCGCGCC GAACGGGCTC TCGCAGCAGC
69241 GGGTCATCCG GAAGGCGCTC GCCGCGGCCG GGCTGACCGG CGCCGACGTG GACGTCGTCG
69301 AGGGGCACGG CACCGGCACC CGGCTCGGCG ACCCGGTCGA GGCGGACGCG CTGCTCGCGA
69361 CGTACGGGCA GGACCGTCCG GCACCGGTCT GGCTGGGCTC GCTGAAGTCG AACATCGGAC 
69421 ATGCCACGGC CGCGGCCGGT GTCGCGGGCG TCATCAAGAT GGTGCAGGCG ATCGGCGCGG 
69481 GCACGATGCC GCGGACGCTG CATGTGGAGG AGCCCTCGCC CGCCGTCGAC TGGAGCACCG 
69541 GACAGGTGTC CCTGCTCGGC TCCAACCGGC CCTGGCCGGA CGACGAGCGT CCGCGCCGGG
69601 CGGCCGTCTC CGCGTTCGGG CTCAGCGGGA CGAACGCGCA CGTCATCCTG GAACAGCACC 
69661 GTCCGGCGCC CGTGGCGTCC CAGCCGCCCC GGCCGCCCCG TGAGGAGTCC CAGCCGCTGC 
69721 CGTGGGTGCT CTCCGCGCGG ACTCCGGCCG CGCTGCGGGC CCAGGCGGCC CGGCTGCGCG 
69781 ACCACCTCGC GGCGGCACCG GACGCGGATC CGTTGGACAT CGGGTACGCG CTGGCCACCA
69841 GCCGCGCCCA GTTCGCCCAC CGTGCCGCGG TCGTCGCCAC CACCCCGGAC GGATTCCGTG
69901 CCGCGCTCGA CGGCCTCGCG GACGGCGCGG AGGCGCCCGG AGTCGTCACC GGGACCGCTC 
69961 AGGAGCGGCG CGTCGCCTTC CTCTTCGACG GCCAGGGCGC CCAGCGCGCC GGAATGGGGC 
70021 GCGAGCTCCA CCGCCGGTTC CCCGTCTTCG CCGCCGCGTG GGACGAGGTC TCCGACGCGT
70081 TCGGCAAGCA CCTCAAGCAC TCCCCCACGG ACGTCTACCA CGGCGAACAC GGCGCTCTCG
70141 CCCATGACAC CCTGTACGCC CAGGCCGGCC TGTTCACGCT CGAAGTGGCG CTGCTGCGGC
70201 TGCTGGAGCA CTGGGGGGTG CGGCCGGACG TGCTCGTCGG GCACTCCGTC GGCGAGGTGA
70261 CCGCGGCGTA CGCGGCGGGG GTGCTCACCC TGGCGGACGC GACGGAGTTG ATCGTGGCCC
70321 GGGGGCGGGC GCTGCGGGCG CTGCCGCCCG GGGCGATGCT CGCCGTCGAC GGAAGCCCGG
70381 CGGAGGTCGG CGCCCGCACG GATCTGGACA TCGCCGCGGT CAACGGCCCG TCCGCCGTGG

70441 TGCTCGCCGG TTCGCCGGAC GATGTGGCGG CGTTCGAACG GGAGTGGTCG GCGGCCGGGC
70501 GGCGCACGAA ACGGCTCGAC GTCGGGCACG CGTTCCACTC CCGGCACGTC GACGGTGCGC 
70561 TCGACGGCTT CCGTACGGTG CTGGAGTCGC TCGCGTTCGG CGCGGCGCGG CTGCCGGTGG 
70621 TGTCCACGAC GACGGGCCGG GACGCCGCGG ACGACCTCAT AACGCCCGCG CACTGGCTGC
70681 GCCATGCGCG TCGGCCGGTG CTGTTCTCGG ATGCCGTCCG GGAGCTGGCC GACCGCGGCG
70741 TCACCACGTT CGTGGCCGTC GGCCCCTCCG GCTCCCTGGC GTCGGCCGCG GCGGAGAGCG
70801 CCGGGGAGGA CGCCGGGACC TACCACGCGG TGCTGCGCGC CCGGACCGGT GAGGAGACCG 
70861 CGGCGCTGAC CGCCCTCGCC GAGCTGCACG CCCACGGCGT CCCGGTCGAC CTGGCCGCGG 
70921 TACTGGCCGG TGGCCGGCCA GTGGACCTTC CCGTGTACGC GTTCCAGCAC CGTTCCTACT
70981 GGCTGGCCCC GGCCGTGGCG GGGGCGCCGG CCACCGTGGC GGACACCGGG GGTCCGGCGG
71041 AGTCCGAGCC GGAGGACCTC ACCGTCGCCG AGATCGTCCG TCGGCGCACC GCGGCGCTGC
71101 TCGGCGTCAC GGACCCCGCC GACGTCGATG CGGAAGCGAC GTTCTTCGCG CTCGGTTTCG 
71161 ACTCACTGGC GGTGCAGCGG CTGCGCAACC AGCTCGCCTC GGCAACCGGG CTGGACCTGC 
71221 CGGCGGCCGT CCTGTTCGAC CACGACACCC CGGCCGCGCT CACCGCGTTC CTCCAGGACC 
71281 GGATCGAGGC CGGCCAGGAC CGGATCGAGG CCGGCGAGGA CGACGACGCG CCCACCGTGC 

71341 TCTCGCTCCT GGAGGAGATG GAGTCGCTCG ACGCCGCGGA CATCGCGGCG ACGCCGGCCC
71401 CGGAGCGTGC GGCCATCGCC GATCTGCTCG ACAAGCTCGC CCATACCTGG AAGGACTACC
71461 GATGAGCACC GATACGCACG AGGGAACGCC GCCCGCCGGC CGCTGCCCAT TCGCGATCCA
71521 GGACGGTCAC CGCGCCATCC TGGAGAGCGG CACGGTGGGT TCGTTCGACC TGTTCGGCGT 
71581 CAAGCACTGG CTGGTCGCCG CCGCCGAGGA CGTCAAGCTG GTCACCAACG ATCCGCGGTT 

71641 CAGCTCGGCC GCGCCGTCCG AGATGCTGCC CGACCGGCGG CCCGGCTGGT TCTCCGGGAT
71701 GGACTCACCG GAGCACAACC GCTACCGGCA GAAGATCGCG GGGGACTTCA CACTGCGCGC 

71761 GGCGCGCAAG CGGGAGGACT TCGTCGCCGA GGCCGCCGAC GCCTGCCTGG ACGACATCGA
71821 GGCCGCGGGA CCCGGCACCG ACCTCATCCC CGGGTACGCC AAGCGGCTGC CCTCCCTCGT 

71881 CATCAACGCG CTGTACGGGC TCACCCCTGA GGAGGGGGCC GTGCTGGAGG CACGGATGCG
71941 CGACATCACC GGCTCGGCCG ATCTGGACAG CGTCAAGACG CTGACCGACG ACTTCTTCGG

72001 GCACGCGCTG CGGCTGGTCC GCGCGAAGCG TGACGAGCGG GGCGAGGACC TGCTGCACCG 
72061 GCTGGCCTCG GCCGACGACG GCGAGATCTC GCTCAGCGAC GACGAGGCGA CGGGCGTGTT 
72121 CGCGACGCTG CTGTTCGCCG GCCACGACTC GGTGCAGCAG ATGGTCGGCT ACTGCCTCTA 
72181 CGCACTGCTC AGCCACCCCG AGCAGCAGGC GGCGCTGCGC GCGCGCCCGG AGCTGGTCGA 
72241 CAACGCGGTC GAGGAGATGC TCCGTTTCCT GCCCGTCAAC CAGATGGGCG TACCGCGCGT
72301 CTGTGTCGAG GACGTCGATG TGCGGGGCGT GCGCATCCGT GCGGGCGACA ACGTGATCCC 
72361 GCTCTACTCG ACGGCCAACC GCGACCCCGA GGTGTTCCCG CAGCCCGACA CCTTCGATGT 
72421 GACGCGCCCG CTGGAGGGCA ACTTCGCGTT CGGCCACGGC ATTCACAAGT GTCCCGGCCA
72481 GCACATCGCC CGGGTGCTCA TCAAGGTCGC CTGCCTGCGG TTGTTCGAGC GTTTCCCGGA
72541 CGTCCGGCTG GCCGGCGACG TGCCGATGAA CGAGGGGCTC GGGCTGTTCA GCCCGGCCGA
72601 GCTGCGGGTC ACCTGGGGGG CGGCATGAGT CACCCGGTGG AGACGTTGCG GTTGCCGAAC
72661 GGGACGACGG TCGCGCACAT CAACGCGGGC GAGGCGCAGT TCCTCTACCG GGAGATCTTC
72721 ACCCAGCGCT GCTACCTGCG CCACGGTGTC GACCTGCGCC CGGGGGACGT GGTGTTCGAC
72781 GTCGGCGCGA ACATCGGCAT GTTCACGCTT TTCGCGCATC TGGAGTGTCC TGGTGTGACC
72841 GTGCACGCCT TCGAGCCCGC GCCCGTGCCG TTCGCGGCGC TGCGGGCGAA CGTGACGCGG
72901 CACGGCATCC CGGGCCAGGC GGACCAGTGC GCGGTCTCCG ACAGCTCCGG CACCCGGAAG
72961 ATGACCTTCT ATCCCGACGC CACGCTGATG TCCGGTTTCC ACGCGGATGC CGCGGCCCGG
73021 ACGGAGCTGT TGCGCACGCT CGGCCTCAAC GGCGGCTACA CCGCCGAGGA CGTCGACACC
73081 ATGCTCGCGC AACTGCCCGA CGTCAGCGAG GAGATCGAAA CCCCTGTGGT CCGGCTCTCC
73141 GACGTCATCG CGGAGCGCGG TATCGAGGCC ATCGGCCTGC TGAAGGTCGA CGTGGAGAAG
73201 AGCGAACGGC AGGTCTTCGC CGGCCTCGAG GACACCGACT GGCCCCGTAT CCGCCAGGTC
73261 GTCGCGGAGG TCCACGACAT CGACGGCGCG CTCGAGGAGG TCGTCACGCT GCTCCGCGGC
73321 CATGGCTTCA CCGTGGTCGC CGAGCAGGAA CCGCTGTTCG CCGGCACGGG CATCCACCAG
73381 GTCGCCGCGC GGCGGGTGGC CGGCTGAGCG CCGTCGGGGC CGCGGCCGTC CGCACCGGCG
73441 GCCGCGGTGC GGACGGCGGC TCAGCCGGCG TCGGACAGTT CCTTGGGCAG TTGCTGACGG
73501 CCCTTCACCC CCAGCTTGCG GAACACGTTG GTGAGGTGCT GTTCCACCGT GCTGGAGGTG
73561 ACGAACAGCT GGCTGGCGAT CTCCTTGTTG GTGCGCCCGA CCGCGGCGTG CGACGCCACC
73621 CGCCGCTCCG CCTCGGTCAG CGATGTGATC CGCTGCGCCG GCGTCACGTC CTGGGTGCCG
73681 TCCGCGTCCG AGGACTCCCC ACCGAGCCGC CGGAGGAGCG GCACGGCTCC GCACTGGGTC
73741 GCGAGGTGCC GTGCGCGGCG GAACAGTCCC CGCGCACGGC TGTGCCGCCG GAGCATGCCG
73801 CACGCTTCGC CCATGTCGGC GAGGACGCGG GCCAGCTCGT ACTGGTCGCG GCACATGATG
73861 AGCAGATCGG CGGCCTCGTC GAGCAGTTCG ATCCGCTTGG CCGGCGGACT GTAGGCCGCC
73921 TGCACCCGCA GCGTCATCAC CCGCGCCCGG GACCCCATCG GCCGGGACAG CTGCTCGGAG
73981 ATGAGCCTCA GCCCCTCGTC ACGGCCGCGG CCGAGCAGCA GAAGCGCTTC GGCGGCGTCG
74041 ACCCGCCACA GGGCCAGGCC CGGCACGTCG ACGGACCAGC GTCGCATCCG CTCCCCGCAG
74101 TCCCGGAACG CGTTGTACGC CGCCCGGTAC CGCCCGGCCG CGAGATGGTG TTGCCCACGG
74161 GCCCAGACCA TGTGCAGTCC GAAGAGGCTG TCGGAGGTCT CCTCCGGCAA CGGCTCGGCG
74221 AGCCACCGCT CCGCCCGGTC CAGGTCGCCC AGTCGGATCG CGGCGGCCAC GGTGCTGCTC
74281 AGCGGCAATG CGGCGGCCAT CCCCCAGGAG GGCACGACCC GGGGGGCGAG CGCGGCCTCG
74341 CCGCATTCGA CGGCGGCGGT CAGGTCGCCG CGGCGCAGCG CGGCCTCGGC GCGGAACCCC
74401 GCGTGGACCG CCTCGTCGGC CGGGGTCCGC ATGTTGTCGT CACCGGCCAG CTTGTCGACC
74461 CAGGACTGGA CGGCATCGGT GTCCTCGGCG TAGAGCAGGG CCAGCAACGC CATCATGGTC
74521 GTGGTCCGGT CCGTCGTGAC CCGGGAGTGC TGGAGCACGT ACTCGGCTTT GGCCTCGGCC
74581 TGTTCGGACC AGCCGCGCAG CGCGTTGCTC AGGGCCTTGT CGGCGACGGC GCGGTGCCGG
74641 ACGGCTCCGG AAAACGAGGC GACCTCGTCC TCGGCCGGCG GATCGGCCGG ACGCGGCGGA
74701 TCGGCCGCGC CGGGATAGAT CAGCGCGAGG GACAGGTCCG CGACGCGCAG GTGCGCCCGG
74761 CCCTGCTCGC TCGGGGCGGC GGAGCGCTGG GCCGCCAGGA CCTCGGCGGC CTCGCCCGGC
74821 CGCCCGTCCA TCGCCAGCCA GCAGGCGAGC GACACGGCGT GCTCGCTGGA GAGGAGCCGT
74881 TCCCGCGACG CGGTGAGCAG CTCGGGCACA TGCCGGCCGG ATCTGGCGGG ATCGCAGAGC
74941 CGCTCGATGG CGGCGGTGTC GACGCGCAGT GCGGCGTGGA CGGCGGGGTC GTCGGAGGCC
75001 CGGTAGGCGA ACTCCAGGTA GGTGACGGCC TCGTCGAGCT CGCCGCGCAG GTGGTGCTCG
75061 CGCGCGGCGT CGGTGAACAG CCCGGCGACC TCGGCGCCGT GCACCCGGCC GGTACCCATC
75121 TGGTGGCGGG CGAGCACCTT GCTGGCCACG CCGCGGTCCC GCAGCAGTTC CAGCGCCAGC
75181 TCGTGCAGGC CACGCCGCTC GGCGGCGGAG AGGTCGTCGA GTACGACGGA GCGGGCCGCG
75241 GGGTGCGGGA ACCGCCCTTC CCGCAGCAGC CGCCCCTCGA CCAGCTGTTC GTGGGCCTGC
75301 TCGACCGCCT CGGTGTCGAG GCCGGTCATC CGCTGGACGA GGGTGAGTTC GACACTCTCG
75361 CCGAGCACGG CGGAAGCTCG GGCGACGCTC AGCGCGGCCG GGCCGCAACG ATAGAGCGAC
75421 CCGAGGTAGG CGAGCCGGTA CGCCCGCCCC GCGACCACTT CCAGGCACCC TGAGGTCCGT
75481 GTCCGTGCCT CCCGGATGTC GTCGATCAGG CCGTGGCCGA GGAGCAGGTT GCCGCCGGTC
75541 GCCCGGAACG CCTGGGCCAC CACGTCGTCG TGCGCGTCCT GGCCGAGGTG CCGGCGCACG
75601 AGTTCGGTGG TCTGCGCCTC GGTGAGCGGG CGCAGCGCGA TCTCCTGGTA GTGGCGCAGA
75661 CTCAGCAGTG CCGCCCGGAA TTGGGAGTGG GCGGGCGTCG GCCGGAGCAG CTCGGTCAGC
75721 ACGATGGCGA CACGGGCCCG GCTGATGCGG CGCGCGAGGT GGAGCAGGCA GCGCAGCGAC
75781 GGCGCGTCGG CGTGGTGCAC GTCGTCGATG CCGATCAGTA CGGGCCGCTC CGCGGCGAGC
75841 GTCAGCACCG TGCGGGTGAG TTCGGTCCCC AGGCGGTTGT CGACGTCGGC CGGCAGGTTT
75901 TCGCACGATG CCGTCAGCCG GACCAGCTCC GGTGTCCGGG CGGCCAGCTC GGGCTGGTCG
75961 AGGAGCTGGC CGAGCATGCC GTACGGCAGG GCCCGCTCCT CCATGGAGCA CACCGCGCGA
76021 AGGGTGACGA AGCCGGCCTT GGCCGCGGCG GCGTCGAGGA GTTCGGTCTT GCCGCAGGCG
76081 ATCGGCCCGG TGACGGCGGC GACGACGCCC CGCCCGCCCC CCGCTCGGGT GAGCGCCCGG
76141 TGGAGGGAAC CGAACTCGTC ATCGCGGGCG ATCAGGTCTG GGGGAGATAA GCGCGCTATC
76201 ACGAATGGAA CTACCTCGCG ACCGTCGTGG AAACCCATAG GCATCACATG GCTTGTTGAT
76261 CTGTACGGCT GTGATTCAGC CTGGCGGGAT GCTGTGCTAC AGATGGGAAG ATGTGATCTA
76321 GGGCCGTGCC GTTCCCTCAG GAGCCGACCG CCCCCGGCGC CACCCGCCGT ACCCCCTGGG
76381 CCACCAGCTC GGCGACCCGC TCCTGGTGGT CGACGAGGTA GAAGTGCCCG CCGGGGAAGA
76441 CCTCCACCGT GGTCGGCGCG GTCGTGTGCC CGGCCCAGGC GTGGGCCTGC TCCACCGTCG
76501 TCTTCGGATC GTCGTCACCG ATGCACACCG TGATCGGCGT CTCCAGCGGC GGCGCGGGCT
76561 CCCACCGGTA CGTCTCCGCC GCGTAGTAGT CCGCCCGCAA CGGCGCCAGG fcTCAGCGCGC
76621 GCATTTCGTC GTCCGCCATC ACATCGGCGC TCGTCCCGCC GAGGCCGATG ACCGCCGCCA
76681 GCAGCTCGTC GTCGGACGCG AGGTGGTCCT GGTCGGCGCG CGGCTGCGAC GGCGCCCGCC
76741 GGCCCGAGAC GATCAGGTGC GCCACCGGGA GCCGCTGGGC CAGCTCGAAC GCGAGTGTCG
76801 CGCCCATGCT GTGGCCGAAC AGCACCAGCG GACGGTCCAG CCCCGGCTTC AACGCCTCGG
76861 CCACGAGGCC GGCGAGAACA CGCAGGTCGC GCACCGCCTC CTCGTCGCGG CGGTCCTGGC
76921 GGCCGGGGTA CTGCACGGCG TACACGTCCG CCACCGGGGC GAGCGCACGG GCCAGCGGAA
76981 GGTAGAACGT CGCCGATCCG CCGGCGTGGG GCAGCAGCAC CACCCGTACC GGGGCCTCGG
77041 GCGTGGGGAA GAACTGCCGC AGCCAGAGTT CCGAGCTCAC CGCACCCCCT CGGCCGCGAC
77101 CTGGGGAGCC CGGAACCGGG TGATCTCGGC CAAGTGCTTC TCCCGCATCT CCGGGTCGGT
77161 CACGCCCCAT CCCTCCTCCG GCGCCAGACA GAGGACGCCG ACTTTGCCGT TGTGCACATT
77221 GCGATGCACA TCGCGCACCG CCGACCCGAC GTCGTCGAGC GGGTAGGTCA CCGACAGCGT
77281 CGGGTGCACC ATCCCCTTGC AGATCAGGCG GTTCGCCTCC CACGCCTCAC GATAGTTCGC
77341 GAAGTGGGTA CCGATGATCC GCTTCACGGA CATCCACAGG TACCGATTGT CAAAGGCGTG
77401 CTCGTATCCC GAGGTTGACG CGCAGGTGAC GATCGTGCCA CCCCGACGTG TCACGTAGAC
77461 ACTCGCGCCG AACGTCGCGC GCCCCGGGTG CTCGAACACG ATGTCGGGAT CGTCACCGCC
77521 GGTCAGCTCC CGGATC
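
The listing above gives the coordinate of the first base on each line, followed by up to six blocks of ten bases. As a rough aid for checking a transcription of such a listing, the following Python sketch parses one line and verifies that consecutive lines are contiguous. The function names are hypothetical and are not part of the disclosure. The sketch assumes that any leading all-digit tokens belong to the coordinate, so a stray margin number printed before the coordinate will be reported as a numbering error rather than silently accepted.

```python
import re


def parse_listing_line(line):
    """Split one listing line into (start_position, bases).

    Leading all-digit tokens are joined to form the coordinate, which
    tolerates a coordinate that was split by stray spaces (e.g. "774 61").
    """
    tokens = line.split()
    digits = []
    while tokens and tokens[0].isdigit():
        digits.append(tokens.pop(0))
    bases = "".join(tokens).upper()
    if not digits or not re.fullmatch(r"[ACGT]+", bases):
        raise ValueError(f"not a clean sequence line: {line!r}")
    return int("".join(digits)), bases


def check_contiguous(lines):
    """Join the bases of several lines, checking that each line starts
    exactly where the previous one ended."""
    expected = None
    parts = []
    for line in lines:
        start, bases = parse_listing_line(line)
        if expected is not None and start != expected:
            raise ValueError(f"expected position {expected}, found {start}")
        expected = start + len(bases)
        parts.append(bases)
    return "".join(parts)
```

A line that still contains non-nucleotide characters raises `ValueError`, which makes residual transcription damage easy to locate.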



Those of skill in the art will recognize that, due to the degenerate nature of the 
genetic code, a variety of DNA compounds differing in their nucleotide sequences can be 
used to encode a given amino acid sequence of the invention. The native DNA sequence 
encoding the FK-520 PKS of Streptomyces hygroscopicus is shown herein merely to 
illustrate a preferred embodiment of the invention, and the present invention includes 
DNA compounds of any sequence that encode the amino acid sequences of the 
polypeptides and proteins of the invention. In similar fashion, a polypeptide can typically 
tolerate one or more amino acid substitutions, deletions, and insertions in its amino acid 
sequence without loss or significant loss of a desired activity. The present invention 
includes such polypeptides with alternate amino acid sequences, and the amino acid 
sequences shown merely illustrate preferred embodiments of the invention. 
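
The degeneracy referred to above can be made concrete with a short Python sketch. It builds the standard genetic code table, translates a DNA string, and counts how many distinct DNA sequences encode a given peptide. This is an illustration only: the standard code is assumed, the helper names are hypothetical, and the 18-nucleotide string in the usage note is chosen purely as an example.

```python
from collections import defaultdict
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); "*" marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid they encode.
SYNONYMS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMS[aa].append(codon)


def translate(dna):
    """Translate a DNA string in frame 1, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the given peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)
```

For example, `translate("ATGAGCACCGATACGCAC")` gives `"MSTDTH"`, and `count_encodings("MSTDTH")` gives 384, so even a six-residue peptide can be encoded by hundreds of different DNA sequences.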
The recombinant nucleic acids, proteins, and peptides of the invention are many
and diverse. To facilitate an understanding of the invention and the diverse compounds
and methods provided thereby, the following general description of the FK-520 PKS
genes and modules of the PKS proteins encoded thereby is provided. This general
description is followed by a more detailed description of the various domains and
modules of the FK-520 PKS contained in and encoded by the compounds of the
invention. In this description, reference to a heterologous PKS refers to any PKS other
than the FK-520 PKS. Unless otherwise indicated, reference to a PKS includes reference
to a portion of a PKS. Moreover, reference to a domain, module, or PKS includes
reference to the nucleic acids encoding the same and vice versa, because the methods and
reagents of the invention provide or enable one to prepare proteins and the nucleic acids
that encode them.

The FK-520 PKS is composed of three proteins encoded by three genes
designated fkbA, fkbB, and fkbC. The fkbA ORF encodes extender modules 7 - 10 of the
PKS. The fkbB ORF encodes the loading module (the CoA ligase) and extender modules
1 - 4 of the PKS. The fkbC ORF encodes extender modules 5 - 6 of the PKS. The fkbP
ORF encodes the NRPS that attaches the pipecolic acid and cyclizes the FK-520
polyketide.
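
The gene-to-module assignments just described can be summarized as a small lookup table. The Python sketch below is only a bookkeeping aid built from the assignments stated in the preceding paragraph; the variable and function names are hypothetical.

```python
# Gene-to-module assignments as stated in the text:
# fkbB: loading module and extender modules 1-4
# fkbC: extender modules 5-6
# fkbA: extender modules 7-10
FK520_PKS_GENES = {
    "fkbB": ["loading"] + list(range(1, 5)),
    "fkbC": [5, 6],
    "fkbA": list(range(7, 11)),
}

# The NRPS that attaches pipecolic acid and cyclizes the polyketide.
NRPS_GENE = "fkbP"


def gene_for_module(module):
    """Return the gene whose ORF encodes the given module."""
    for gene, modules in FK520_PKS_GENES.items():
        if module in modules:
            return gene
    raise KeyError(f"no PKS gene listed for module {module!r}")


def biosynthetic_order():
    """List (gene, module) pairs from the loading module to extender module 10."""
    order = []
    for gene in ("fkbB", "fkbC", "fkbA"):
        order.extend((gene, module) for module in FK520_PKS_GENES[gene])
    return order
```

Note that the order of the modules in biosynthesis (fkbB, then fkbC, then fkbA) differs from the alphabetical order of the gene names.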

The loading module of the FK-520 PKS includes a CoA ligase, an ER domain,
and an ACP domain. The starter building block or unit for FK-520 is believed to be a
dihydroxycyclohexene carboxylic acid, which is derived from shikimate. The
recombinant DNA compounds of the invention that encode the loading module of the
FK-520 PKS and the corresponding polypeptides encoded thereby are useful for a variety
of methods and in a variety of compounds. In one embodiment, a DNA compound
comprising a sequence that encodes the FK-520 loading module is inserted into a DNA
compound that comprises the coding sequence for a heterologous PKS. The resulting
construct, in which the coding sequence for the loading module of the heterologous PKS
is replaced by the coding sequence for the FK-520 loading module, provides a novel PKS
coding sequence. Examples of heterologous PKS coding sequences include the
rapamycin, FK-506, rifamycin, and avermectin PKS coding sequences. In another
embodiment, a DNA compound comprising a sequence that encodes the FK-520 loading
module is inserted into a DNA compound that comprises the coding sequence for the
FK-520 PKS or a recombinant FK-520 PKS that produces an FK-520 derivative.
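
The loading-module replacement described above can be pictured as an operation on an ordered list of modules. The following Python sketch is a conceptual model only, not a cloning protocol. The class and function names are hypothetical, and the host PKS in the usage note is a placeholder whose domain content is invented for illustration; only the domain content of the FK-520 loading module is taken from the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Module:
    source: str                 # PKS the module coding sequence comes from
    name: str                   # "loading" or an extender-module label
    domains: Tuple[str, ...]


@dataclass(frozen=True)
class PKS:
    name: str
    modules: Tuple[Module, ...]


# Domain content of the FK-520 loading module as described in the text.
FK520_LOADING = Module("FK-520", "loading", ("CoA ligase", "ER", "ACP"))


def swap_loading_module(host, donor_loading):
    """Return a hybrid PKS model in which the host's loading module is
    replaced by the donor loading module; extender modules are unchanged."""
    if not host.modules or host.modules[0].name != "loading":
        raise ValueError("host PKS model must start with a loading module")
    hybrid_name = f"{host.name} with {donor_loading.source} loading module"
    return PKS(hybrid_name, (donor_loading,) + host.modules[1:])
```

For example, with a placeholder host such as `PKS("host", (Module("host", "loading", ("AT", "ACP")), Module("host", "extender 1", ("KS", "AT", "ACP"))))`, calling `swap_loading_module(host, FK520_LOADING)` returns a model whose first module is the FK-520 loading module and whose remaining modules are those of the host.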

In another embodiment, a portion of the loading module coding sequence is
utilized in conjunction with a heterologous coding sequence. In this embodiment, the
invention provides, for example, either replacing the CoA ligase with a different CoA
ligase, deleting the ER, or replacing the ER with a different ER. In addition, or
alternatively, the ACP can be replaced by another ACP. In similar fashion, the
corresponding domains in another loading or extender module can be replaced by one or
more domains of the FK-520 PKS. The resulting heterologous loading module coding
sequence can be utilized in conjunction with a coding sequence for a PKS that
synthesizes FK-520, an FK-520 derivative, or another polyketide.

The first extender module of the FK-520 PKS includes a KS domain, an AT 
domain specific for methylmalonyl CoA, a DH domain, a KR domain, and an ACP 

1 5 domain. The recombinant DNA compounds of the invention that encode the first 
extender module of the FK-520 PKS and the corresponding polypeptides encoded 
thereby are useful for a variety of applications. In one embodiment, a DNA compound 
comprising a sequence that encodes the FK-520 first extender module is inserted into a 
DNA compound that comprises the coding sequence for a heterologous PKS. The 

20 resulting construct, in which the coding sequence for a module of the heterologous PKS 
is either replaced by that for the first extender module of the FK-520 PKS or the latter is 
merely added to coding sequences for modules of the heterologous PKS, provides a novel 
PKS coding sequence. In another embodiment, a DNA compound comprising a sequence 
that encodes the first extender module of the FK-520 PKS is inserted into a DNA 
compound that comprises the remainder of the coding sequence for the FK-520 PKS or a 
recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, all or only a portion of the first extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; deleting either the DH or KR or both; replacing the 
DH or KR or both with another DH or KR; and/or inserting an ER. In replacing or 
inserting KR, DH, and ER domains, it is often beneficial to replace the existing KR, DH, 
and ER domains with the complete set of domains desired from another module. Thus, if 
one desires to insert an ER domain, one may simply replace the existing KR and DH 
domains with a KR, DH, and ER set of domains from a module containing such domains. 
In addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a gene for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous first extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the first 
extender module of the FK-520 PKS. 

In an illustrative embodiment of this aspect of the invention, the invention 
provides recombinant PKSs and recombinant DNA compounds and vectors that encode 
such PKSs in which the KS domain of the first extender module has been inactivated. 
Such constructs are especially useful when placed in translational reading frame with the 
remaining modules and domains of an FK-520 or FK-520 derivative PKS. The utility of 
these constructs is that host cells expressing, or cell free extracts containing, the PKS 
encoded thereby can be fed or supplied with N-acylcysteamine thioesters of novel 
precursor molecules to prepare FK-520 derivatives. See U.S. patent application Serial 
No. 60/117,384, filed 27 Jan. 1999, and PCT patent publication Nos. US97/02358 and 
US99/03986, each of which is incorporated herein by reference. 

The second extender module of the FK-520 PKS includes a KS, an AT specific 
for methylmalonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA 
compounds of the invention that encode the second extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 second extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the second 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the second extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, all or a portion of the second extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
methylmalonyl CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the 
KR with another KR; and/or inserting an active DH or an active DH and an ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous second extender module coding sequence 
can be utilized in conjunction with a coding sequence from a PKS that synthesizes FK- 
520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the second extender module of the FK-520 PKS. 

The third extender module of the FK-520 PKS includes a KS, an AT specific for 
malonyl CoA, a KR, an inactive DH, and an ACP. The recombinant DNA compounds of 
the invention that encode the third extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
third extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence 
for a module of the heterologous PKS is either replaced by that for the third extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the third extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, all or a portion of the third extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; deleting the KR and/or the inactive DH; replacing the 
KR with another KR; and/or inserting an active DH or an active DH and an ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous third extender module coding sequence 
can be utilized in conjunction with a coding sequence from a PKS that synthesizes 
FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the corresponding 
domains in a module of a heterologous PKS can be replaced by one or more domains of 
the third extender module of the FK-520 PKS. 

The fourth extender module of the FK-520 PKS includes a KS, an AT that binds 
ethylmalonyl CoA, an inactive DH, and an ACP. The recombinant DNA compounds of 
the invention that encode the fourth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
fourth extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence 
for a module of the heterologous PKS is either replaced by that for the fourth extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the fourth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the 
remainder of the coding sequence for the FK-520 PKS or a recombinant FK-520 PKS 
that produces an FK-520 derivative. 

In another embodiment, a portion of the fourth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the ethylmalonyl 
CoA specific AT with a malonyl CoA, methylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting the inactive DH; and/or inserting a KR, a KR and an active DH, or a 
KR, an active DH, and an ER. In addition, the KS and/or ACP can be replaced with 
another KS and/or ACP. In each of these replacements or insertions, the heterologous KS, 
AT, DH, KR, ER, or ACP coding sequence can originate from a coding sequence for 
another module of the FK-520 PKS, from a coding sequence for a PKS that produces a 
polyketide other than FK-520, or from chemical synthesis. The resulting heterologous 
fourth extender module coding sequence can be utilized in conjunction with a coding 
sequence for a PKS that synthesizes FK-520, an FK-520 derivative, or another 
polyketide. In similar fashion, the corresponding domains in a module of a heterologous 
PKS can be replaced by one or more domains of the fourth extender module of the 
FK-520 PKS. 

As illustrative examples, the present invention provides recombinant genes, 
vectors, and host cells that result from the conversion of the FK-506 PKS to an FK-520 
PKS and vice versa. In one embodiment, the invention provides a recombinant set of 
FK-506 PKS genes but in which the coding sequences for the fourth extender module or at 
least those for the AT domain in the fourth extender module have been replaced by those 
for the AT domain of the fourth extender module of the FK-520 PKS. This recombinant 
PKS can be used to produce FK-520 in recombinant host cells. In another embodiment, 
the invention provides a recombinant set of FK-520 PKS genes but in which the coding 
sequences for the fourth extender module or at least those for the AT domain in the fourth 
extender module have been replaced by those for the AT domain of the fourth extender 
module of the FK-506 PKS. This recombinant PKS can be used to produce FK-506 in 
recombinant host cells. 

Other examples of hybrid PKS enzymes of the invention include those in which 
the AT domain of module 4 has been replaced with a malonyl specific AT domain to 
provide a PKS that produces 21-desethyl-FK520 or with a methylmalonyl specific AT 
domain to provide a PKS that produces 21-desethyl-21-methyl-FK520. Another hybrid 
PKS of the invention is prepared by replacing the AT and inactive KR domain of FK-520 
extender module 4 with a methylmalonyl specific AT and an active KR domain, such as, 
for example, from module 2 of the DEBS or oleandolide PKS enzymes, to produce 21- 
desethyl-21-methyl-22-desoxo-22-hydroxy-FK520. The compounds produced by these 
hybrid PKS enzymes are neurotrophins. 
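
The module 4 AT replacements just described can be summarized as a simple lookup. The following Python sketch is illustrative only: the names MODULE4_AT_PRODUCTS and module4_product are hypothetical and form no part of the invention, and the product names are those stated in the preceding paragraph.

```python
# Illustrative lookup only. The dictionary and function names are hypothetical;
# the product names are taken from the text above.
MODULE4_AT_PRODUCTS = {
    "ethylmalonyl": "FK-520",  # native AT specificity of FK-520 extender module 4
    "malonyl": "21-desethyl-FK520",
    "methylmalonyl": "21-desethyl-21-methyl-FK520",
}


def module4_product(at_specificity: str) -> str:
    """Return the product named in the text for a given module 4 AT specificity."""
    try:
        return MODULE4_AT_PRODUCTS[at_specificity]
    except KeyError:
        # The text names no product for other specificities, so none is guessed here.
        raise ValueError(
            f"no product is named for AT specificity {at_specificity!r}"
        ) from None
```

A call such as module4_product("malonyl") returns the name given above for the malonyl specific replacement; any specificity for which the text names no product raises an error rather than returning a guess.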

The fifth extender module of the FK-520 PKS includes a KS, an AT that binds 
methylmalonyl CoA, a DH, a KR, and an ACP. The recombinant DNA compounds of the 
invention that encode the fifth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 fifth 
extender module is inserted into a DNA compound that comprises the coding sequence 
for a heterologous PKS. The resulting construct, in which the coding sequence for a 
module of the heterologous PKS is either replaced by that for the fifth extender module 
of the FK-520 PKS or the latter is merely added to coding sequences for the modules of 
the heterologous PKS, provides a novel PKS. In another embodiment, a DNA compound 
comprising a sequence that encodes the fifth extender module of the FK-520 PKS is 
inserted into a DNA compound that comprises the coding sequence for the FK-520 PKS 
or a recombinant FK-520 PKS that produces an FK-520 derivative. 

In another embodiment, a portion of the fifth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one or both of the DH and KR; replacing any one or both of the 
DH and KR with another KR and/or DH; and/or inserting an ER. In addition, the KS 
and/or ACP can be replaced with another KS and/or ACP. In each of these replacements 
or insertions, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous fifth extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the fifth 
extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the DH domain of the fifth 
extender module have been deleted or mutated to render the DH non-functional. In one 
such mutated gene, the KR and DH coding sequences are replaced with those encoding 
only a KR domain from another PKS gene. The resulting PKS genes code for the 
expression of an FK-520 PKS that produces an FK-520 analog that lacks the C-19 to C- 
20 double bond of FK-520 and has a C-20 hydroxyl group. Such analogs are preferred 
neurotrophins, because they have little or no immunosuppressant activity. This 
recombinant fifth extender module coding sequence can be combined with other coding 
sequences to make additional compounds of the invention. In an illustrative embodiment, 
the present invention provides a recombinant FK-520 PKS that contains both this fifth 
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 producing 
host cells that have been mutated to prevent production of FK-506 but that express this 
recombinant PKS and so synthesize the corresponding (lacking the C-19 to C-20 double 
bond of FK-506 and having a C-20 hydroxyl group) FK-506 derivative. In another 
embodiment, the present invention provides a recombinant FK-506 PKS in which the DH 
domain of module 5 has been deleted or otherwise rendered inactive and thus produces 
this novel polyketide. 

The sixth extender module of the FK-520 PKS includes a KS, an AT specific for 
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the sixth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 sixth extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the sixth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the sixth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the sixth extender module coding sequence is 
utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing 
any one, two, or all three of the KR, DH, and ER with another KR, DH, and ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous sixth extender module coding sequence can be 
utilized in conjunction with a coding sequence for a PKS that synthesizes FK-520, an 
FK-520 derivative, or another polyketide. In similar fashion, the corresponding domains 
in a module of a heterologous PKS can be replaced by one or more domains of the sixth 
extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the DH and ER domains of the 
sixth extender module have been deleted or mutated to render them non-functional. In 
one such mutated gene, the KR, ER, and DH coding sequences are replaced with those 
encoding only a KR domain from another PKS gene. This can also be accomplished by 
simply replacing the coding sequences for extender module six with those for an extender 
module having a methylmalonyl specific AT and only a KR domain from a heterologous 
PKS gene, such as, for example, the coding sequences for extender module two encoded 
by the eryAI gene. The resulting PKS genes code for the expression of an FK-520 PKS 
that produces an FK-520 analog that has a C-18 hydroxyl group. Such analogs are 
preferred neurotrophins, because they have little or no immunosuppressant activity. This 
recombinant sixth extender module coding sequence can be combined with other coding 
sequences to make additional compounds of the invention. In an illustrative embodiment, 
the present invention provides a recombinant FK-520 PKS that contains both this sixth 
extender module and the recombinant fourth extender module described above that 
comprises the coding sequence for the fourth extender module AT domain of the FK-506 
PKS. The invention also provides recombinant host cells derived from FK-506 producing 
host cells that have been mutated to prevent production of FK-506 but that express this 
recombinant PKS and so synthesize the corresponding (having a C-18 hydroxyl group) 
FK-506 derivative. In another embodiment, the present invention provides a recombinant 
FK-506 PKS in which the DH and ER domains of module 6 have been deleted or 
otherwise rendered inactive and thus produces this novel polyketide. 

The seventh extender module of the FK-520 PKS includes a KS, an AT specific 
for 2-hydroxymalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the seventh extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 seventh extender module is inserted into a DNA compound that comprises 
the coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the seventh 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the seventh extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion or all of the seventh extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 2- 
hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 
malonyl CoA specific AT; deleting the KR, the DH, and/or the ER; and/or replacing the 
KR, DH, and/or ER. In addition, the KS and/or ACP can be replaced with another KS 
and/or ACP. In each of these replacements or insertions, the heterologous KS, AT, DH, 
KR, ER, or ACP coding sequence can originate from a coding sequence for another 
module of the FK-520 PKS, from a coding sequence for a PKS that produces a polyketide 
other than FK-520, or from chemical synthesis. The resulting heterologous seventh 
extender module coding sequence can be utilized in conjunction with a coding sequence 
for a PKS that synthesizes FK-520, an FK-520 derivative, or another polyketide. In 
similar fashion, the corresponding domains in a module of a heterologous PKS can be 
replaced by one or more domains of the seventh extender module of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the seventh 
extender module have been replaced with those encoding an AT domain for malonyl, 
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes 
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the 
C-15 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than 
FK-520. This recombinant seventh extender module coding sequence can be combined 
with other coding sequences to make additional compounds of the invention. In an 
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that 
contains both this seventh extender module and the recombinant fourth extender module 
described above that comprises the coding sequence for the fourth extender module AT 
domain of the FK-506 PKS. The invention also provides recombinant host cells derived 
from FK-506 producing host cells that have been mutated to prevent production of 
FK-506 but that express this recombinant PKS and so synthesize the corresponding 
(C-15-desmethoxy) FK-506 derivative. In another embodiment, the present invention provides a 
recombinant FK-506 PKS in which the AT domain of module 7 has been replaced and 
thus produces this novel polyketide. 
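
The correspondence between the AT specificity placed in the seventh extender module and the group found at C-15 follows the "respectively" pairing stated above. The following Python sketch is illustrative only: the names C15_GROUP_BY_AT and c15_group are hypothetical, and the entry for the native 2-hydroxymalonyl specific AT assumes the C-15 methoxy group that the text attributes to FK-520.

```python
# Illustrative mapping only. Names are hypothetical; the pairings follow the
# "respectively" ordering in the text above.
C15_GROUP_BY_AT = {
    "2-hydroxymalonyl": "methoxy",  # native module 7 AT; FK-520 has a C-15 methoxy
    "malonyl": "hydrogen",
    "methylmalonyl": "methyl",
    "ethylmalonyl": "ethyl",
}


def c15_group(at_specificity: str) -> str:
    """Return the C-15 group described for a given module 7 AT specificity."""
    try:
        return C15_GROUP_BY_AT[at_specificity]
    except KeyError:
        # Only the specificities discussed in the text are covered.
        raise ValueError(
            f"no C-15 outcome is described for AT specificity {at_specificity!r}"
        ) from None
```

The mapping is kept as plain data so that additional AT specificities can be added only where a corresponding product has actually been characterized.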

In another illustrative embodiment, the present invention provides a hybrid PKS 
in which the AT and KR domains of module 7 of the FK-520 PKS are replaced by a 
methylmalonyl specific AT domain and an inactive KR domain, such as, for example, the 
AT and KR domains of extender module 6 of the rapamycin PKS. The resulting hybrid 
PKS produces 15-desmethoxy-15-methyl-16-oxo-FK-520, a neurotrophin compound. 

The eighth extender module of the FK-520 PKS includes a KS, an AT specific for 
2-hydroxymalonyl CoA, a KR, and an ACP. The recombinant DNA compounds of the 
invention that encode the eighth extender module of the FK-520 PKS and the 
corresponding polypeptides encoded thereby are useful for a variety of applications. In 
one embodiment, a DNA compound comprising a sequence that encodes the FK-520 
eighth extender module is inserted into a DNA compound that comprises the coding 
sequence for a heterologous PKS. The resulting construct, in which the coding sequence 
for a module of the heterologous PKS is either replaced by that for the eighth extender 
module of the FK-520 PKS or the latter is merely added to coding sequences for the 
modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the eighth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the eighth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the 2- 
hydroxymalonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 
malonyl CoA specific AT; deleting or replacing the KR; and/or inserting a DH or a DH 
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP. 
In each of these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding 
sequence can originate from a coding sequence for another module of the FK-520 PKS, 
from a coding sequence for a PKS that produces a polyketide other than FK-520, or from 
chemical synthesis. The resulting heterologous eighth extender module coding sequence 
can be utilized in conjunction with a PKS that synthesizes FK-520, an FK-520 derivative, 
or another polyketide. In similar fashion, the corresponding domains in a module of a 
heterologous PKS can be replaced by one or more domains of the eighth extender module 
of the FK-520 PKS. 

In an illustrative embodiment, the present invention provides a set of recombinant 
FK-520 PKS genes in which the coding sequences for the AT domain of the eighth 
extender module have been replaced with those encoding an AT domain for malonyl, 
methylmalonyl, or ethylmalonyl CoA from another PKS gene. The resulting PKS genes 
code for the expression of an FK-520 PKS that produces an FK-520 analog that lacks the 
C-13 methoxy group, having instead a hydrogen, methyl, or ethyl group at that position, 
respectively. Such analogs are preferred, because they are more slowly metabolized than 
FK-520. This recombinant eighth extender module coding sequence can be combined 
with other coding sequences to make additional compounds of the invention. In an 
illustrative embodiment, the present invention provides a recombinant FK-520 PKS that 
contains both this eighth extender module and the recombinant fourth extender module 
described above that comprises the coding sequence for the fourth extender module AT 
domain of the FK-506 PKS. The invention also provides recombinant host cells derived 
from FK-506 producing host cells that have been mutated to prevent production of 
FK-506 but that express this recombinant PKS and so synthesize the corresponding 
(C-13-desmethoxy) FK-506 derivative. In another embodiment, the present invention provides a 
recombinant FK-506 PKS in which the AT domain of module 8 has been replaced and 
thus produces this novel polyketide. 

The ninth extender module of the FK-520 PKS includes a KS, an AT specific for 
methylmalonyl CoA, a KR, a DH, an ER, and an ACP. The recombinant DNA 
compounds of the invention that encode the ninth extender module of the FK-520 PKS 
and the corresponding polypeptides encoded thereby are useful for a variety of 
applications. In one embodiment, a DNA compound comprising a sequence that encodes 
the FK-520 ninth extender module is inserted into a DNA compound that comprises the 
coding sequence for a heterologous PKS. The resulting construct, in which the coding 
sequence for a module of the heterologous PKS is either replaced by that for the ninth 
extender module of the FK-520 PKS or the latter is merely added to coding sequences for 
the modules of the heterologous PKS, provides a novel PKS coding sequence. In another 
embodiment, a DNA compound comprising a sequence that encodes the ninth extender 
module of the FK-520 PKS is inserted into a DNA compound that comprises the coding 
sequence for the remainder of the FK-520 PKS or a recombinant FK-520 PKS that 
produces an FK-520 derivative. 

In another embodiment, a portion of the ninth extender module coding sequence 
is utilized in conjunction with other PKS coding sequences to create a hybrid module. In 
this embodiment, the invention provides, for example, either replacing the methylmalonyl 
CoA specific AT with a malonyl CoA, ethylmalonyl CoA, or 2-hydroxymalonyl CoA 
specific AT; deleting any one, two, or all three of the KR, DH, and ER; and/or replacing 
any one, two, or all three of the KR, DH, and ER with another KR, DH, and/or ER. In 
addition, the KS and/or ACP can be replaced with another KS and/or ACP. In each of 
these replacements, the heterologous KS, AT, DH, KR, ER, or ACP coding sequence can 
originate from a coding sequence for another module of the FK-520 PKS, from a coding 
sequence for a PKS that produces a polyketide other than FK-520, or from chemical 
synthesis. The resulting heterologous ninth extender module coding sequence can be 
utilized in conjunction with a PKS that synthesizes FK-520, an FK-520 derivative, or 
another polyketide. In similar fashion, the corresponding domains in a module of a 
heterologous PKS can be replaced by one or more domains of the ninth extender module 
of the FK-520 PKS. 

The tenth extender module of the FK-520 PKS includes a KS, an AT specific for 
malonyl CoA, and an ACP. The recombinant DNA compounds of the invention that 
encode the tenth extender module of the FK-520 PKS and the corresponding polypeptides 
encoded thereby are useful for a variety of applications. In one embodiment, a DNA 
compound comprising a sequence that encodes the FK-520 tenth extender module is 
inserted into a DNA compound that comprises the coding sequence for a heterologous 
PKS. The resulting construct, in which the coding sequence for a module of the 
heterologous PKS is either replaced by that for the tenth extender module of the FK-520 
PKS or the latter is merely added to coding sequences for the modules of the 
heterologous PKS, provides a novel PKS coding sequence. In another embodiment, a 
DNA compound comprising a sequence that encodes the tenth extender module of the 
FK-520 PKS is inserted into a DNA compound that comprises the coding sequence for 
the remainder of the FK-520 PKS or a recombinant FK-520 PKS that produces an FK- 
520 derivative. 

In another embodiment, a portion or all of the tenth extender module coding 
sequence is utilized in conjunction with other PKS coding sequences to create a hybrid 
module. In this embodiment, the invention provides, for example, either replacing the 
malonyl CoA specific AT with a methylmalonyl CoA, ethylmalonyl CoA, or 2- 
hydroxymalonyl CoA specific AT; and/or inserting a KR, a KR and DH, or a KR, DH, 
and an ER. In addition, the KS and/or ACP can be replaced with another KS and/or ACP. 
In each of these replacements or insertions, the heterologous KS, AT, DH, KR, ER, or 
ACP coding sequence can originate from a coding sequence for another module of the 
FK-520 PKS, from a coding sequence for a PKS that produces a polyketide other than 
FK-520, or from chemical synthesis. The resulting heterologous tenth extender module 

25 coding sequence can be utilized in conjunction with a coding sequence for a PKS that 
synthesizes FK-520, an FK-520 derivative, or another polyketide. In similar fashion, the 
corresponding domains in a module of a heterologous PKS can be replaced by one or 
more domains of the tenth extender module of the FK-520 PKS. 
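
The domain replacements and insertions described for the tenth extender module can be pictured with a small data model. The following Python sketch is illustrative only: the Module class, the make_hybrid function, and the validation rules are hypothetical names and modeling assumptions drawn from the wording above, and they do not represent any actual sequence or cloning procedure.

```python
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple

# Reductive domain sets named in the text: none, KR, KR+DH, or KR+DH+ER.
ALLOWED_REDUCTIVE = {(), ("KR",), ("KR", "DH"), ("KR", "DH", "ER")}

# AT specificities named in the text for this module and its hybrids.
ALLOWED_SUBSTRATES = {
    "malonyl CoA",
    "methylmalonyl CoA",
    "ethylmalonyl CoA",
    "2-hydroxymalonyl CoA",
}


@dataclass(frozen=True)
class Module:
    """Hypothetical record of one extender module (not a sequence)."""
    name: str
    at_substrate: str
    reductive: Tuple[str, ...] = ()

    def domains(self) -> List[str]:
        # Listing order is for display only; it is not the order on the polypeptide.
        return ["KS", "AT", *self.reductive, "ACP"]


def make_hybrid(module: Module,
                at_substrate: Optional[str] = None,
                reductive: Optional[Tuple[str, ...]] = None) -> Module:
    """Return a copy of `module` with the AT specificity and/or reductive set changed."""
    new_substrate = module.at_substrate if at_substrate is None else at_substrate
    new_reductive = module.reductive if reductive is None else tuple(reductive)
    if new_substrate not in ALLOWED_SUBSTRATES:
        raise ValueError(f"unknown AT specificity: {new_substrate}")
    if new_reductive not in ALLOWED_REDUCTIVE:
        raise ValueError(f"unsupported reductive domain set: {new_reductive}")
    return replace(module, name=module.name + " (hybrid)",
                   at_substrate=new_substrate, reductive=new_reductive)


# Tenth extender module as described: KS, AT specific for malonyl CoA, ACP.
module10 = Module("FK-520 module 10", "malonyl CoA")
hybrid = make_hybrid(module10, at_substrate="methylmalonyl CoA",
                     reductive=("KR", "DH"))
print(hybrid.domains(), hybrid.at_substrate)
```

The record is immutable, so each hybrid is a new object and the description of the native module is left unchanged, which mirrors the way one parent coding sequence can give rise to many hybrid constructs.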



The FK-520 polyketide precursor produced by the action of the tenth extender
module of the PKS is then attached to pipecolic acid and cyclized to form FK-520. The
enzyme FkbP is the NRPS like enzyme that catalyzes these reactions. FkbP also includes
a thioesterase activity that cleaves the nascent FK-520 polyketide from the NRPS. The 
present invention provides recombinant DNA compounds that encode the fkbP gene and 
so provides recombinant methods for expressing the fkbP gene product in recombinant 
host cells. The recombinant fkbP genes of the invention include those in which the coding 
sequence for the adenylation domain has been mutated or replaced with coding sequences 
from other NRPS like enzymes so that the resulting recombinant FkbP incorporates a 
moiety other than pipecolic acid. For the construction of host cells that do not naturally 
produce pipecolic acid, the present invention provides recombinant DNA compounds that 
express the enzymes that catalyze at least some of the biosynthesis of pipecolic acid (see 
Nielsen et al., 1991, Biochem. 30: 5789-96). The fkbL gene encodes a homolog of RapL,
a lysine cyclodeaminase responsible in part for producing the pipecolate unit added to the
end of the polyketide chain. The fkbB and fkbL recombinant genes of the invention can be
used in heterologous hosts to produce compounds such as FK-520 or, in conjunction with
other PKS or NRPS genes, to produce known or novel polyketides and non-ribosomal
peptides. 

The present invention also provides recombinant DNA compounds that encode 
the P450 oxidase and methyltransferase genes involved in the biosynthesis of FK-520. 
Figure 2 shows the various sites on the FK-520 polyketide core structure at which these 
enzymes act. By providing these genes in recombinant form, the present invention 
provides recombinant host cells that can produce FK-520. This is accomplished by 
introducing the recombinant PKS, P450 oxidase, and methyltransferase genes into a 
heterologous host cell. In a preferred embodiment, the heterologous host cell is 
Streptomyces coelicolor CH999 or Streptomyces lividans K4-114, as described in U.S.
Patent No. 5,830,750 and U.S. patent application Serial Nos. 08/828,898, filed 31 Mar. 
1997, and 09/181,833, filed 28 Oct. 1998, each of which is incorporated herein by 
reference. In addition, by providing recombinant host cells that express only a subset of 
these genes, the present invention provides methods for making FK-520 precursor 
compounds not readily obtainable by other means. 
In a related aspect, the present invention provides recombinant DNA compounds
and vectors that are useful in generating, by homologous recombination, recombinant 
host cells that produce FK-520 precursor compounds. In this aspect of the invention, a 
native host cell that produces FK-520 is transformed with a vector (such as an SCP2* 
derived vector for Streptomyces host cells) that encodes one or more disrupted genes (i.e.,
a hydroxylase, a methyltransferase, or both) or merely flanking regions from those genes. 
When the vector integrates by homologous recombination, the native, functional gene is 
deleted or replaced by the non-functional recombinant gene, and the resulting host cell 
thus produces an FK-520 precursor. Such host cells can also be complemented by 
introduction of a modified form of the deleted or mutated non-functional gene to produce
a novel compound. 

In one important embodiment, the present invention provides a hybrid PKS and 
the corresponding recombinant DNA compounds that encode those hybrid PKS enzymes. 
For purposes of the present invention a hybrid PKS is a recombinant PKS that comprises
all or part of one or more modules and thioesterase/cyclase domain of a first PKS and all
or part of one or more modules, loading module, and thioesterase/cyclase domain of a 
second PKS. In one preferred embodiment, the first PKS is all or part of the FK-520 
PKS, and the second PKS is only a portion or all of a non-FK-520 PKS. 

One example of the preferred embodiment is an FK-520 PKS in which the AT
domain of module 8, which specifies a hydroxymalonyl CoA and from which the C-13
methoxy group of FK-520 is derived, is replaced by an AT domain that specifies a
malonyl, methylmalonyl, or ethylmalonyl CoA. Examples of such replacement AT
domains include the AT domains from modules 3, 12, and 13 of the rapamycin PKS and
from modules 1 and 2 of the erythromycin PKS. Such replacements, conducted at the
level of the gene for the PKS, are illustrated in the examples below. Another illustrative
example of such a hybrid PKS includes an FK-520 PKS in which the natural loading 
module has been replaced with a loading module of another PKS. Another example of 
such a hybrid PKS is an FK-520 PKS in which the AT domain of module three is 
replaced with an AT domain that binds methylmalonyl CoA. 
In another preferred embodiment, the first PKS is most but not all of a non-FK-520
PKS, and the second PKS is only a portion or all of the FK-520 PKS. An illustrative
example of such a hybrid PKS includes an erythromycin PKS in which an AT specific for
methylmalonyl CoA is replaced with an AT from the FK-520 PKS specific for malonyl
CoA.

Those of skill in the art will recognize that all or part of either the first or second 
PKS in a hybrid PKS of the invention need not be isolated from a naturally occurring 
source. For example, only a small portion of an AT domain determines its specificity. See 
U.S. provisional patent application Serial No. 60/091,526, incorporated herein by 
reference. The state of the art in DNA synthesis allows the artisan to construct de novo
DNA compounds of size sufficient to construct a useful portion of a PKS module or
domain. For purposes of the present invention, such synthetic DNA compounds are
deemed to be a portion of a PKS.

Thus, the hybrid modules of the invention are incorporated into a PKS to provide
a hybrid PKS of the invention. A hybrid PKS of the invention can result not only:

(i) from fusions of heterologous domain (where heterologous means the domains
in that module are from at least two different naturally occurring modules) coding
sequences to produce a hybrid module coding sequence contained in a PKS gene whose
product is incorporated into a PKS,

but also:

(ii) from fusions of heterologous module (where heterologous module means two 
modules are adjacent to one another that are not adjacent to one another in naturally 
occurring PKS enzymes) coding sequences to produce a hybrid coding sequence 
contained in a PKS gene whose product is incorporated into a PKS, 

(iii) from expression of one or more FK-520 PKS genes with one or more
non-FK-520 PKS genes, including both naturally occurring and recombinant non-FK-520
PKS genes, and

(iv) from combinations of the foregoing. 
Various hybrid PKSs of the invention illustrating these various alternatives are described
herein.
Examples of the production of a hybrid PKS by co-expression of PKS genes from
the FK-520 PKS and a non-FK-520 PKS include hybrid PKS enzymes produced by
co-expression of FK-520 and rapamycin PKS genes. Preferably, such hybrid PKS
enzymes are produced in recombinant Streptomyces host cells that produce FK-520 or
FK-506 but have been mutated to inactivate the gene whose function is to be replaced by
the rapamycin PKS gene introduced to produce the hybrid PKS. Particular examples
include (i) replacement of the fkbC gene with the rapB gene; and (ii) replacement of the
fkbA gene with the rapC gene. The latter hybrid PKS produces 13,15-didesmethoxy-FK-520,
if the host cell is an FK-520 producing host cell, and 13,15-didesmethoxy-FK-506,
if the host cell is an FK-506 producing host cell. The compounds produced by these
hybrid PKS enzymes are immunosuppressants and neurotrophins but can be readily
modified to act only as neurotrophins, as described in Example 6, below.

Other illustrative hybrid PKS enzymes of the invention are prepared by replacing
the fkbA gene of an FK-520 or FK-506 producing host cell with a hybrid fkbA gene in
which: (a) the extender module 8 through 10, inclusive, coding sequences have been
replaced by the coding sequences for extender modules 12 to 14, inclusive, of the
rapamycin PKS; and (b) the module 8 coding sequences have been replaced by the
module 8 coding sequence of the rifamycin PKS. When expressed with the other,
naturally occurring FK-520 or FK-506 PKS genes and the genes of the modification
enzymes, the resulting hybrid PKS enzymes produce, respectively, (a) 13-desmethoxy-
FK-520 or 13-desmethoxy-FK-506; and (b) 13-desmethoxy-13-methyl-FK-520 or
13-desmethoxy-13-methyl-FK-506. In a preferred embodiment, these recombinant PKS
genes of the invention are introduced into the producing host cell by a vector such as
pHU204, which is a plasmid pRM5 derivative that has the well-characterized SCP2*
replicon, the colE1 replicon, the tsr and bla resistance genes, and a cos site. This vector
can be used to introduce the recombinant fkbA replacement gene in an FK-520 or FK-506
producing host cell (or a host cell derived therefrom in which the endogenous fkbA gene
has been rendered inactive by mutation, deletion, or homologous recombination
with the gene that replaces it) to produce the desired hybrid PKS.
In constructing hybrid PKSs of the invention, certain general methods may be
helpful. For example, it is often beneficial to retain the framework of the module to be 
altered to make the hybrid PKS. Thus, if one desires to add DH and ER functionalities to 
a module, it is often preferred to replace the KR domain of the original module with a 
KR, DH, and ER domain-containing segment from another module, instead of merely 
inserting DH and ER domains. One can alter the stereochemical specificity of a module 
by replacement of the KS domain with a KS domain from a module that specifies a 
different stereochemistry. See Lau et al., 1999, "Dissecting the role of acyltransferase
domains of modular polyketide synthases in the choice and stereochemical fate of
extender units," Biochemistry 38(5): 1643-1651, incorporated herein by reference.
Stereochemistry can also be changed by changing the KR domain. Also, one can alter the
specificity of an AT domain by changing only a small segment of the domain. See Lau et
al., supra. One can also take advantage of known linker regions in PKS proteins to link
modules from two different PKSs to create a hybrid PKS. See Gokhale et al., 16 Apr.
1999, "Dissecting and Exploiting Intermodular Communication in Polyketide Synthases," 
Science 284: 482-485, incorporated herein by reference. 

The following Table lists references describing illustrative PKS genes and 
corresponding enzymes that can be utilized in the construction of the recombinant PKSs 
and the corresponding DNA compounds that encode them of the invention. Also 
presented are various references describing tailoring enzymes and corresponding genes 
that can be employed in accordance with the methods of the present invention. 
Avermectin 

U.S. Pat. No. 5,252,474 to Merck. 

MacNeil et al., 1993, Industrial Microorganisms: Basic and Applied Molecular
Genetics, Baltz, Hegeman, & Skatrud, eds. (ASM), pp. 245-256, A Comparison of the
Genes Encoding the Polyketide Synthases for Avermectin, Erythromycin, and 
Nemadectin. 

MacNeil et al., 1992, Gene 115: 119-125, Complex Organization of the
Streptomyces avermitilis genes encoding the avermectin polyketide synthase. 
Ikeda et al., Aug. 1999, Organization of the biosynthetic gene cluster for the
polyketide anthelmintic macrolide avermectin in Streptomyces avermitilis, Proc. Natl. 
Acad. Sci. USA 96: 9509-9514. 
Candicidin (FR008) 

Hu et al., 1994, Mol. Microbiol. 14: 163-172.
Epothilone 

U.S. Pat. App. Serial No. 60/130,560, filed 22 April 1999.
Erythromycin 

PCT Pub. No. 93/13663 to Abbott. 
US Pat. No. 5,824,513 to Abbott. 
Donadio et al, 1991, Science 252:675-9. 

Cortes et al., 8 Nov. 1990, Nature 348: 176-8, An unusually large
multifunctional polypeptide in the erythromycin producing polyketide synthase of 
Saccharopolyspora erythraea. 

Glycosylation Enzymes 

PCT Pat. App. Pub. No. 97/23630 to Abbott. 
FK-506 

Motamedi et al, 1998, The biosynthetic gene cluster for the macrolactone ring of 
the immunosuppressant FK-506, Eur. J. Biochem. 256: 528-534.

Motamedi et al, 1997, Structural organization of a multifunctional polyketide 
synthase involved in the biosynthesis of the macrolide immunosuppressant FK-506, Eur. 
J. Biochem. 244: 74-80. 

Methyltransferase 

US 5,264,355, issued 23 Nov. 1993, Methylating enzyme from 
Streptomyces MA6858. 31-O-desmethyl-FK-506 methyltransferase. 

Motamedi et al, 1996, Characterization of methyltransferase and 
hydroxylase genes involved in the biosynthesis of the immunosuppressants FK-506 and 
FK-520, J. Bacteriol. 178: 5243-5248.
Streptomyces hygroscopicus 

U.S. patent application Serial No. 09/154,083, filed 16 Sep. 1998. 
Lovastatin

U.S. Pat. No. 5,744,350 to Merck. 
Narbomycin 

U.S. patent application Serial No. 60/107,093, filed 5 Nov. 1998, and Serial No. 
60/120,254, filed 16 Feb. 1999. 
Nemadectin 

MacNeil et al., 1993, supra
Niddamycin 

Kakavas et al., 1997, Identification and characterization of the niddamycin
polyketide synthase genes from Streptomyces caelestis, J. Bacteriol. 179: 7515-7522.
Oleandomycin 

Swan et al., 1994, Characterisation of a Streptomyces antibioticus gene encoding
a type I polyketide synthase which has an unusual coding sequence, Mol. Gen. Genet. 
242:358-362. 

U.S. patent application Serial No. 60/120,254, filed 16 Feb. 1999. 

Olano et al., 1998, Analysis of a Streptomyces antibioticus chromosomal region
involved in oleandomycin biosynthesis, which encodes two glycosyltransferases 
responsible for glycosylation of the macrolactone ring, Mol. Gen. Genet. 259(3): 299- 
308. 

Picromycin 

PCT patent application US99/15047, filed 2 Jul. 1999. 
Xue et al, 1998, Hydroxylation of macrolactones YC-17 and narbomycin is 
mediated by the pikC-encoded cytochrome P450 in Streptomyces venezuelae, Chemistry
& Biology 5(11): 661-667. 

Xue et al, Oct. 1998, A gene cluster for macrolide antibiotic biosynthesis in 
Streptomyces venezuelae: Architecture of metabolic diversity, Proc. Natl. Acad. Sci. 
USA 95: 12111-12116.
Platenolide 

EP Pat. App. Pub. No. 791,656 to Lilly.
Rapamycin

Schwecke et al, Aug. 1995, The biosynthetic gene cluster for the 
polyketide rapamycin, Proc. Natl. Acad. Sci. USA 92:7839-7843. 

Aparicio et al, 1996, Organization of the biosynthetic gene cluster for rapamycin 
in Streptomyces hygroscopicus: analysis of the enzymatic domains in the modular 
polyketide synthase, Gene 169: 9-16. 
Rifamycin 

August et al., 13 Feb. 1998, Biosynthesis of the ansamycin antibiotic rifamycin:
deductions from the molecular analysis of the rif biosynthetic gene cluster of
Amycolatopsis mediterranei S699, Chemistry & Biology, 5(2): 69-79.
Sorangium PKS 

U.S. patent application Serial No. 09/144,085, filed 31 Aug. 1998. 
Soraphen 

U.S. Pat. No. 5,716,849 to Novartis. 

Schupp et al., 1995, J. Bacteriology 177: 3673-3679. A Sorangium cellulosum
(Myxobacterium) Gene Cluster for the Biosynthesis of the Macrolide Antibiotic 
Soraphen A: Cloning, Characterization, and Homology to Polyketide Synthase Genes 
from Actinomycetes. 
Spiramycin 

U.S. Pat. No. 5,098,837 to Lilly. 

Activator Gene 

U.S. Pat. No. 5,514,544 to Lilly. 
Tylosin 

EP Pub. No. 791,655 to Lilly. 
U.S. Pat. No. 5,876,991 to Lilly. 

Kuhstoss et al., 1996, Gene 183: 231-6, Production of a novel polyketide through
the construction of a hybrid polyketide synthase. 
Tailoring enzymes

Merson-Davies and Cundliffe, 1994, Mol Microbiol 13: 349-355. Analysis of 
five tylosin biosynthetic genes from the tylBA region of the Streptomyces fradiae 
genome. 

As the above Table illustrates, there are a wide variety of polyketide synthase
genes that serve as readily available sources of DNA and sequence information for use in
constructing the hybrid PKS-encoding DNA compounds of the invention. Methods for
constructing hybrid PKS-encoding DNA compounds are described without reference to
the FK-520 PKS in PCT patent publication No. 98/51695; U.S. Patent Nos. 5,672,491
and 5,712,146; and U.S. patent application Serial Nos. 09/073,538, filed 6 May 1998, and
09/141,908, filed 28 Aug. 1998, each of which is incorporated herein by reference.

The hybrid PKS-encoding DNA compounds of the invention can be and often are
hybrids of more than two PKS genes. Moreover, there are often two or more modules in
the hybrid PKS in which all or part of the module is derived from a second (or third)
PKS. Thus, as one illustrative example, the present invention provides a hybrid FK-520
PKS that contains the naturally occurring loading module and FkbP as well as modules
one, two, four, six, seven, eight, nine, and ten of the FK-520 PKS and further
contains hybrid or heterologous modules three and five. Hybrid or heterologous module
three contains an AT domain that is specific for methylmalonyl CoA and can be derived,
for example, from the erythromycin or rapamycin PKS genes. Hybrid or heterologous
module five contains an AT domain that is specific for malonyl CoA and can be derived,
for example, from the picromycin or rapamycin PKS genes.

While an important embodiment of the present invention relates to hybrid PKS
enzymes and corresponding genes, the present invention also provides recombinant
FK-520 PKS genes in which there is no second PKS gene sequence present but which differ
from the FK-520 PKS gene by one or more deletions. The deletions can encompass one
or more modules and/or can be limited to a partial deletion within one or more modules.
When a deletion encompasses an entire module, the resulting FK-520 derivative is at
least two carbons shorter than the compound from which it was derived. When a deletion is
within a module, the deletion typically encompasses a KR, DH, or ER domain, or both
DH and ER domains, or both KR and DH domains, or all three KR, DH, and ER
domains.
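
The "at least two carbons" figure follows from each extender module adding one extender unit to the growing chain. As a rough illustration, the Python sketch below tallies the carbons removed when whole modules are deleted, based on the substrate each module's AT domain selects. The function name and the table are hypothetical, and the carbon counts are an assumption based on the general chemistry of these extender units (two backbone carbons plus any side-chain carbons) rather than on anything stated in this specification.

```python
# Carbons added to the polyketide per extender unit: two backbone carbons
# plus any carbons in the side chain (general chemistry, stated as an assumption).
CARBONS_PER_EXTENDER = {
    "malonyl CoA": 2,          # no side chain
    "methylmalonyl CoA": 3,    # methyl side chain
    "ethylmalonyl CoA": 4,     # ethyl side chain
}


def carbons_lost(deleted_module_substrates):
    """Total carbons removed when the listed modules are deleted whole."""
    return sum(CARBONS_PER_EXTENDER[s] for s in deleted_module_substrates)


print(carbons_lost(["malonyl CoA"]))                      # 2
print(carbons_lost(["malonyl CoA", "ethylmalonyl CoA"]))  # 6
```

Deleting a module with a malonyl CoA specific AT gives the minimum loss of two carbons; any other specificity in this table removes more.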

To construct a hybrid PKS or FK-520 derivative PKS gene of the invention, one 
can employ a technique, described in PCT Pub. No. 98/27203 and U.S. patent application 
Serial No. 08/989,332, filed 11 Dec. 1997, each of which is incorporated herein by
reference, in which the large PKS gene is divided into two or more, typically three, 
segments, and each segment is placed on a separate expression vector. In this manner, 
each of the segments of the gene can be altered, and various altered segments can be 
combined in a single host cell to provide a recombinant PKS gene of the invention. This
technique makes more efficient the construction of large libraries of recombinant PKS
genes, vectors for expressing those genes, and host cells comprising those vectors. 

Thus, in one important embodiment, the recombinant DNA compounds of the 
invention are expression vectors. As used herein, the term expression vector refers to any 
nucleic acid that can be introduced into a host cell or cell-free transcription and
translation medium. An expression vector can be maintained stably or transiently in a
cell, whether as part of the chromosomal or other DNA in the cell or in any cellular 
compartment, such as a replicating vector in the cytoplasm. An expression vector also 
comprises a gene that serves to produce RNA that is translated into a polypeptide in the 
cell or cell extract. Furthermore, expression vectors typically contain additional
functional elements, such as resistance-conferring genes to act as selectable markers.

The various components of an expression vector can vary widely, depending on 
the intended use of the vector. In particular, the components depend on the host cell(s) in 
which the vector will be used or is intended to function. Vector components for 
expression and maintenance of vectors in E. coli are widely known and commercially
available, as are vector components for other commonly used organisms, such as yeast
cells and Streptomyces cells. 

In a preferred embodiment, the expression vectors of the invention are used to 
construct recombinant Streptomyces host cells that express a recombinant PKS of the 
invention. Preferred Streptomyces host cell/vector combinations of the invention include
S. coelicolor CH999 and S. lividans K4-114 host cells, which do not produce
actinorhodin, and expression vectors derived from the pRM1 and pRM5 vectors, as
described in U.S. Patent No. 5,830,750 and U.S. patent application Serial Nos. 
08/828,898, filed 31 Mar. 1997, and 09/181,833, filed 28 Oct. 1998, each of which is 
incorporated herein by reference. 

The present invention provides a wide variety of expression vectors for use in 
Streptomyces. For replicating vectors, the origin of replication can be from, for example and
without limitation, a low copy number vector, such as SCP2* (see Hopwood et al.,
Genetic Manipulation of Streptomyces: A Laboratory Manual (The John Innes
Foundation, Norwich, U.K., 1985); Lydiate et al., 1985, Gene 35: 223-235; and Kieser
and Melton, 1988, Gene 65: 83-91, each of which is incorporated herein by reference),
SLP1.2 (Thompson et al., 1982, Gene 20: 51-62, incorporated herein by reference), and
SG5(ts) (Muth et al., 1989, Mol. Gen. Genet. 219: 341-348, and Bierman et al., 1992,
Gene 116: 43-49, each of which is incorporated herein by reference), or a high copy
number vector, such as pIJ101 and pJV1 (see Katz et al., 1983, J. Gen. Microbiol. 129:
2703-2714; Vara et al., 1989, J. Bacteriol. 171: 5872-5881; and Servin-Gonzalez, 1993,
Plasmid 30: 131-140, each of which is incorporated herein by reference). Generally,
however, high copy number vectors are not preferred for expression of genes contained
on large segments of DNA. For non-replicating and integrating vectors, it is useful to
include at least an E. coli origin of replication, such as from pUC, p1P, p1I, and pBR. For
phage based vectors, the phages phiC31 and KC515 can be employed (see Hopwood et
al., supra).

Typically, the expression vector will comprise one or more marker genes by 
which host cells containing the vector can be identified and/or selected. Useful antibiotic 
resistance conferring genes for use in Streptomyces host cells include the ermE (confers 
resistance to erythromycin and other macrolides and lincomycin), tsr (confers resistance 
to thiostrepton), aadA (confers resistance to spectinomycin and streptomycin), aacC4 
(confers resistance to apramycin, kanamycin, gentamicin, geneticin (G418), and 
neomycin), hyg (confers resistance to hygromycin), and vph (confers resistance to 
viomycin) resistance conferring genes. 
The recombinant PKS gene on the vector will be under the control of a promoter,
typically with an attendant ribosome binding site sequence. The present invention 
provides the endogenous promoters of the FK-520 PKS and related biosynthetic genes in 
recombinant form, and these promoters are preferred for use in the native hosts and in 
heterologous hosts in which the promoters function. A preferred promoter of the 
invention is the fkbO gene promoter, comprised in a sequence of about 270 bp between
the start of the open reading frames of the fkbO and fkbB genes. The fkbO promoter is
believed to be bi-directional in that it promotes transcription of the genes fkbO, fkbP, and
fkbA in one direction and fkbB, fkbC, and fkbL in the other. Thus, in one aspect, the
present invention provides a recombinant expression vector comprising the promoter of
the fkbO gene of an FK-520 producing organism positioned to transcribe a gene other
than fkbO. In a preferred embodiment the transcribed gene is an FK-520 PKS gene. In 
another preferred embodiment, the transcribed gene is a gene that encodes a protein 
comprised in a hybrid PKS. 

Heterologous promoters can also be employed and are preferred for use in host 
cells in which the endogenous FK-520 PKS gene promoters do not function or function 
poorly. A preferred heterologous promoter is the actI promoter and its attendant activator
gene actII-ORF4, which are provided in the pRM1 and pRM5 expression vectors, supra.
This promoter is activated in the stationary phase of growth when secondary metabolites
are normally synthesized. Other useful Streptomyces promoters include without limitation
those from the ermE gene and the melC1 gene, which act constitutively, and the tipA
gene and the merA gene, which can be induced at any growth stage. In addition, the T7 
RNA polymerase system has been transferred to Streptomyces and can be employed in 
the vectors and host cells of the invention. In this system, the coding sequence for the T7 
RNA polymerase is inserted into a neutral site of the chromosome or in a vector under the 
control of the inducible merA promoter, and the gene of interest is placed under the 
control of the T7 promoter. As noted above, one or more activator genes can also be 
employed to enhance the activity of a promoter. Activator genes in addition to the
actII-ORF4 gene discussed above include dnrI, redD, and ptpA genes (see U.S. patent
application Serial No. 09/181,833, supra) to activate promoters under their control. 
In addition to providing recombinant DNA compounds that encode the FK-520
PKS, the present invention also provides DNA compounds that encode the enzymes that
produce the ethylmalonyl CoA and 2-hydroxymalonyl CoA utilized in the synthesis of
FK-520. Thus, the present
invention also provides recombinant host cells that express the genes required for the 
biosynthesis of ethylmalonyl CoA and 2-hydroxymalonyl CoA. Figures 3 and 4 show the 
location of these genes on the cosmids of the invention and the biosynthetic pathway that 
produces ethylmalonyl CoA. 

For 2-hydroxymalonyl CoA biosynthesis, the fkbH, fkbI, fkbJ, and fkbK genes are
sufficient to confer this ability on Streptomyces host cells. For conversion of
2-hydroxymalonyl to 2-methoxymalonyl, the fkbG gene is also employed. While the
complete coding sequence for fkbH is provided on the cosmids of the invention, the
sequence for this gene provided herein may be missing a T residue, based on a 
comparison made with a similar gene cloned from the ansamitocin gene cluster by Dr. H. 
Floss. Where the sequence herein shows one T, there may be two, resulting in an 
extension of the fkbH reading frame to encode the amino acid sequence: 
MTIVKCLVWDLDNTLWRGTVLEDDEVVLTDEIREVITTLDDRGILQAVASKNDH 
DLAWERLERLGVAEYFVLARIGWGPKSQSVREIATELNFAPTTIAFIDDQPAERA 
EVAFHLPEVRCYPAEQAATLLSLPEFSPPVSTVDSRRRRLMYQAGFARDQAREA 
YSGPDEDFLRSLDLSMTIAPAGEEELSRVEELTLRTSQMNATGVHYSDADLRALL 
TDPAHEVLVVTMGDRFGPHGAVGIILLEKKPSTWHLKLLATSCRVVSFGAGATIL 
NWLTDQGARAGAHLVADFRRTDRNRMMEIAYRFAGFADSDCPCVSEVAGASA 
AGVERLHLEPSARPAPTTLTLTAADIAPVTVSAAG. 
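
The effect of a single missing T on a reading frame can be shown with a toy example. The Python sketch below is an assumption-laden illustration: the 17-base sequence is invented for the purpose and is not taken from fkbH, the function names are hypothetical, and the standard genetic code is used. It shows only the general principle that adding one base changes every downstream codon and can remove a premature stop, thereby extending the open reading frame.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def insert_base(dna, index, base):
    """Return `dna` with `base` inserted before position `index`."""
    return dna[:index] + base + dna[index:]


# Invented toy sequence with a single T at index 6 (hypothetical, not fkbH).
one_t = "ATGGCCTGAGCGGCTAA"
two_t = insert_base(one_t, 6, "T")  # the same sequence with two T residues

print(translate(one_t))  # MA
print(translate(two_t))  # MALSG
```

With one T the frame reads ATG GCC TGA and stops after two residues; with two T residues the frame reads ATG GCC TTG AGC GGC TAA and continues for five.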

For ethylmalonyl CoA biosynthesis, one requires only a crotonyl CoA reductase, 
which can be supplied by the host cell but can also be supplied by recombinant 
expression of the fkbS gene of the present invention. To increase yield of ethylmalonyl 
CoA, one can also express the fkbE and fkbU genes. While such production can
be achieved using only the recombinant genes above, one can also achieve such 
production by placing into the recombinant host cell a large segment of the DNA 
provided by the cosmids of the invention. Thus, for 2-hydroxymalonyl and 2- 
methoxymalonyl CoA biosynthesis, one can simply provide the cells with the segment of 
DNA located on the left side of the FK-520 PKS genes shown in Figure 1. For
ethylmalonyl CoA biosynthesis, one can simply provide the cells with the segment of 
DNA located on the right side of the FK-520 PKS genes shown in Figure 1 or, 
alternatively, both the right and left segments of DNA. 
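
The gene requirements just described can be summarized as a lookup. The Python sketch below is a bookkeeping illustration only: the dictionary layout and function names are hypothetical, and the gene sets simply restate the text above (fkbH, fkbI, fkbJ, and fkbK for 2-hydroxymalonyl CoA; fkbG in addition for the methoxy form; fkbS as the crotonyl CoA reductase for ethylmalonyl CoA, with fkbE and fkbU as optional yield enhancers). It does not model a host cell that supplies its own crotonyl CoA reductase.

```python
# Genes the text describes as sufficient for each precursor.
REQUIRED_GENES = {
    "2-hydroxymalonyl CoA": {"fkbH", "fkbI", "fkbJ", "fkbK"},
    "2-methoxymalonyl CoA": {"fkbH", "fkbI", "fkbJ", "fkbK", "fkbG"},
    "ethylmalonyl CoA": {"fkbS"},
}

# Genes the text describes as optional, to increase yield.
YIELD_ENHANCERS = {"ethylmalonyl CoA": {"fkbE", "fkbU"}}


def producible(genes_present):
    """Precursors whose required genes are all present, in sorted order."""
    present = set(genes_present)
    return sorted(name for name, required in REQUIRED_GENES.items()
                  if required <= present)


def missing_genes(target, genes_present):
    """Required genes for `target` that are absent from `genes_present`."""
    return REQUIRED_GENES[target] - set(genes_present)


host = ["fkbH", "fkbI", "fkbJ", "fkbK"]
print(producible(host))
print(missing_genes("2-methoxymalonyl CoA", host))
```

A lookup of this kind is convenient when deciding which cosmid segment, left, right, or both, needs to be supplied to a given host.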

The recombinant DNA expression vectors that encode these genes can be used to 
construct recombinant host cells that can make these important polyketide building 
blocks from cells that otherwise are unable to produce them. For example, Streptomyces
coelicolor and Streptomyces lividans do not synthesize ethylmalonyl CoA or 2-
hydroxymalonyl CoA. The invention provides methods and vectors for constructing 
recombinant Streptomyces coelicolor and Streptomyces lividans that are able to 
synthesize either or both ethylmalonyl CoA and 2-hydroxymalonyl CoA. These host cells
are thus able to make polyketides that require these substrates and that cannot otherwise
be made in such cells.

In a preferred embodiment, the present invention provides recombinant 
Streptomyces host cells, such as S. coelicolor and S. lividans, that have been transformed 
with a recombinant vector of the invention that codes for the expression of the 
ethylmalonyl CoA biosynthetic genes. The resulting host cells produce ethylmalonyl 
CoA and so are preferred host cells for the production of polyketides produced by PKS 
enzymes that comprise one or more AT domains specific for ethylmalonyl CoA. 
Illustrative PKS enzymes of this type include the FK-520 PKS and a recombinant PKS in 
which one or more AT domains is specific for ethylmalonyl CoA. 

In a related embodiment, the present invention provides Streptomyces host cells in 
which one or more of the ethylmalonyl or 2-hydroxymalonyl biosynthetic genes have 
been deleted by homologous recombination or rendered inactive by mutation. For 
example, deletion or inactivation of the fkbG gene can prevent formation of the methoxyl 
groups at C-13 and C-15 of FK-520 (or, in the corresponding FK-506 producing cell, FK-
506), leading to the production of 13,15-didesmethoxy-13,15-dihydroxy-FK-520 (or, in 
the corresponding FK-506 producing cell, 13,15-didesmethoxy-13,15-dihydroxy-FK- 
506). If the fkbG gene product acts on 2-hydroxymalonyl and the resulting 2- 
methoxymalonyl substrate is required for incorporation by the PKS, the AT domains of 
modules 7 and 8 may bind malonyl CoA and methylmalonyl CoA. Such incorporation
results in the production of a mixture of polyketides in which the methoxy groups at C-13 
and C-l 5 of FK-520 (or FK-506) are replaced by either hydrogen or methyl. ~" 

This possibility of non-specific binding is suggested by the construction of a hybrid
PKS of the invention in which the AT domain of module 8 of the FK-520 PKS replaced
the AT domain of module 6 of DEBS. The resulting PKS produced, in Streptomyces
lividans, 6-dEB and 2-desmethyl-6-dEB, indicating that the AT domain of module 8 of
the FK-520 PKS could bind malonyl CoA and methylmalonyl CoA substrates. Thus, one
could possibly also prepare the 13,15-didesmethoxy-FK-520 and corresponding FK-506
compounds of the invention by deleting or otherwise inactivating one or more or all of
the genes required for 2-hydroxymalonyl CoA biosynthesis, i.e., the fkbH, fkbI, fkbJ, and
fkbK genes. In any event, the deletion or inactivation of one or more biosynthetic genes
required for ethylmalonyl and/or 2-hydroxymalonyl production prevents the formation of
polyketides requiring ethylmalonyl and/or 2-hydroxymalonyl for biosynthesis, and the
resulting host cells are thus preferred for production of polyketides that do not require the
same.

The host cells of the invention can be grown and fermented under conditions 
known in the art for other purposes to produce the compounds of the invention. See, e.g., 
U.S. Patent Nos. 5,194,378; 5,116,756; and 5,494,820, incorporated herein by reference,
for suitable fermentation processes. The compounds of the invention can be isolated from 
the fermentation broths of these cultured cells and purified by standard procedures. 
Preferred compounds of the invention include the following compounds:
13-desmethoxy-FK-506; 13-desmethoxy-FK-520; 13,15-didesmethoxy-FK-506;
13,15-didesmethoxy-FK-520; 13-desmethoxy-18-hydroxy-FK-506; 13-desmethoxy-18-hydroxy-FK-520;
13,15-didesmethoxy-18-hydroxy-FK-506; and 13,15-didesmethoxy-18-hydroxy-FK-520.
These compounds can be further modified as described for tacrolimus and FK-520 in 
U.S. Patent Nos. 5,225,403; 5,189,042; 5,164,495; 5,068,323; 4,980,466; and 4,920,218, 
incorporated herein by reference. 

Other compounds of the invention are shown in Figure 8, Parts A and B. In Figure
8, Part A, illustrative C-32-substituted compounds of the invention are shown in two
columns under the heading R. The substituted compounds are preferred for topical
administration and are applied to the dermis for treatment of conditions such as psoriasis.
In Figure 8, Part B, illustrative reaction schemes for making the compounds shown in
Figure 8, Part A, are provided. In the upper scheme in Figure 8, Part B, the C-32
substitution is a tetrazole moiety, illustrative of the groups shown in the left column
under R in Figure 8, Part A. In the lower scheme in Figure 8, Part B, the C-32
substitution is a disubstituted amino group, where R3 and R4 can be any group similar to
the illustrative groups shown attached to the amine in the right column under R in Figure
8, Part A. While Figure 8 shows the C-32-substituted compounds in which the
C-15-methoxy is present, the invention includes these C-32-substituted compounds in which
C-15 is ethyl, methyl, or hydrogen. Also, while C-21 is shown as substituted with ethyl or
allyl, the compounds of the invention include the C-32-substituted compounds in which
C-21 is substituted with hydrogen or methyl.

To make these C-32-substituted compounds, Figure 8, Part B, provides illustrative
reaction schemes. Thus, a selective reaction of the starting compound (see Figure 8, Part
B, for an illustrative starting compound) with trifluoromethanesulfonic anhydride in the
presence of a base yields the C-32 O-triflate derivative, as shown in the upper scheme of
Figure 8, Part B. Displacement of the triflate with 1H-tetrazole or triazole derivatives
provides the C-32 tetrazole or triazole derivative. As shown in the lower scheme of
Figure 8, Part B, reacting the starting compound with p-nitrophenylchloroformate yields
the corresponding carbonate, which, upon displacement with an amino compound,
provides the corresponding carbamate derivative.

The compounds can be readily formulated to provide the pharmaceutical 
compositions of the invention. The pharmaceutical compositions of the invention can be 
used in the form of a pharmaceutical preparation, for example, in solid, semisolid, or
liquid form. This preparation contains one or more of the compounds of the invention as 
an active ingredient in admixture with an organic or inorganic carrier or excipient 
suitable for external, enteral, or parenteral application. The active ingredient may be 
compounded, for example, with the usual non-toxic, pharmaceutically acceptable carriers 
for tablets, pellets, capsules, suppositories, solutions, emulsions, suspensions, and any
other form suitable for use. Suitable formulation processes and compositions for the
compounds of the present invention are described with respect to tacrolimus in U.S. 
Patent Nos. 5,939,427; 5,922,729; 5,385,907; 5,338,684; and 5,260,301, incorporated 
herein by reference. Many of the compounds of the invention contain one or more chiral 
centers, and all of the stereoisomers are included within the scope of the invention, as
pure compounds as well as mixtures of stereoisomers. Thus the compounds of the 
invention may be supplied as a mixture of stereoisomers in any proportion. 

The carriers which can be used include water, glucose, lactose, gum acacia,
gelatin, mannitol, starch paste, magnesium trisilicate, talc, corn starch, keratin, colloidal
silica, potato starch, urea, and other carriers suitable for use in manufacturing
preparations, in solid, semi-solid, or liquified form. In addition, auxiliary stabilizing,
thickening, and coloring agents and perfumes may be used. For example, the compounds
of the invention may be utilized with hydroxypropyl methylcellulose essentially as
described in U.S. Patent No. 4,916,138, incorporated herein by reference, or with a
surfactant essentially as described in EPO patent publication No. 428,169, incorporated
herein by reference.

Oral dosage forms may be prepared essentially as described by Hondo et al.,
1987, Transplantation Proceedings XIX, Supp. 6: 17-22, incorporated herein by
reference. Dosage forms for external application may be prepared essentially as described
in EPO patent publication No. 423,714, incorporated herein by reference. The active
compound is included in the pharmaceutical composition in an amount sufficient to
produce the desired effect upon the disease process or condition.

For the treatment of conditions and diseases relating to immunosuppression or
neuronal damage, a compound of the invention may be administered orally, topically,
parenterally, by inhalation spray, or rectally in dosage unit formulations containing
conventional non-toxic pharmaceutically acceptable carriers, adjuvants, and vehicles. The
term parenteral, as used herein, includes subcutaneous injections, and intravenous,
intramuscular, and intrasternal injection or infusion techniques.

Dosage levels of the compounds of the present invention are of the order of from
about 0.01 mg to about 50 mg per kilogram of body weight per day, preferably from
about 0.1 mg to about 10 mg per kilogram of body weight per day. The dosage levels are
useful in the treatment of the above-indicated conditions (from about 0.7 mg to about 3.5
g per patient per day, assuming a 70 kg patient). In addition, the compounds of the
present invention may be administered on an intermittent basis, i.e., at semi-weekly,
weekly, semi-monthly, or monthly intervals.
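
The per-patient figures follow from multiplying the per-kilogram range by body weight. The following is a minimal sketch of that arithmetic, assuming the 70 kg reference patient named above; the function name is illustrative and not part of the specification.

```python
def daily_dose_mg(dose_mg_per_kg: float, body_weight_kg: float = 70.0) -> float:
    """Convert a weight-normalized daily dose (mg/kg/day) to mg per patient per day."""
    if dose_mg_per_kg < 0 or body_weight_kg <= 0:
        raise ValueError("dose must be non-negative and body weight positive")
    return dose_mg_per_kg * body_weight_kg


# Broad range stated in the text: about 0.01 to about 50 mg/kg/day.
broad_low, broad_high = daily_dose_mg(0.01), daily_dose_mg(50)
# Preferred range stated in the text: about 0.1 to about 10 mg/kg/day.
pref_low, pref_high = daily_dose_mg(0.1), daily_dose_mg(10)

print(f"broad range:     {broad_low:.1f} mg to {broad_high:.0f} mg per patient per day")
print(f"preferred range: {pref_low:.0f} mg to {pref_high:.0f} mg per patient per day")
```

For a 70 kg patient, the broad range works out to about 0.7 mg to about 3500 mg (3.5 g) per day, and the preferred range to about 7 mg to about 700 mg per day.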

The amount of active ingredient that may be combined with the carrier materials
to produce a single dosage form will vary depending upon the host treated and the
particular mode of administration. For example, a formulation intended for oral
administration to humans may contain from 0.5 mg to 5 g of active agent compounded
with an appropriate and convenient amount of carrier material, which may vary from
about 5 percent to about 95 percent of the total composition. Dosage unit forms will
generally contain from about 0.5 mg to about 500 mg of active ingredient. For external
administration, the compounds of the invention can be formulated within the range of, for
example, 0.00001% to 60% by weight, preferably from 0.001% to 10% by weight, and
most preferably from about 0.005% to 0.8% by weight. The compounds and
compositions of the invention are useful in treating disease conditions using doses and
administration schedules as described for tacrolimus in U.S. Patent Nos. 5,542,436;
5,365,948; 5,348,966; and 5,196,437, incorporated herein by reference. The compounds
of the invention can be used as single therapeutic agents or in combination with other
therapeutic agents. Drugs that can be usefully combined with compounds of the invention
include one or more immunosuppressant agents such as rapamycin, cyclosporin A, and
FK-506, or one or more neurotrophic agents.

It will be understood, however, that the specific dosage level for any particular 
patient will depend on a variety of factors. These factors include the activity of the
specific compound employed; the age, body weight, general health, sex, and diet of the
subject; the time and route of administration and the rate of excretion of the drug; 
whether a drug combination is employed in the treatment; and the severity of the 
particular disease or condition for which therapy is sought.

A detailed description of the invention having been provided above, the following
examples are given for the purpose of illustrating the present invention and shall not be
construed as being a limitation on the scope of the invention or claims.

Example 1
Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-520
The C-13 methoxyl group is introduced into FK-520 via an AT domain in
extender module 8 of the PKS that is specific for hydroxymalonyl and by methylation of
the hydroxyl group by an S-adenosyl methionine (SAM) dependent methyltransferase.
Metabolism of FK-506 and FK-520 primarily involves oxidation at the C-13 position into
an inactive derivative that is further degraded by host P450 and other enzymes. The
present invention provides compounds related in structure to FK-506 and FK-520 that do
not contain the C-13 methoxy group and exhibit greater stability and a longer half-life in
vivo. These compounds are useful medicaments due to their immunosuppressive and
neurotrophic activities, and the invention provides the compounds in purified form and as
pharmaceutical compositions.

The present invention also provides the novel PKS enzymes that produce these
novel compounds as well as the expression vectors and host cells that produce the novel
PKS enzymes. The novel PKS enzymes include, among others, those that contain an AT
domain specific for either malonyl CoA or methylmalonyl CoA in module 8 of the
FK-506 and FK-520 PKS. This example describes the construction of recombinant DNA
compounds that encode the novel FK-520 PKS enzymes and the transformation of host
cells with those recombinant DNA compounds to produce the novel PKS enzymes and
the polyketides produced thereby.

To construct an expression cassette for performing module 8 AT domain
replacements in the FK-520 PKS, a 4.6 kb SphI fragment from the FK-520 gene cluster
was cloned into plasmid pLitmus 38 (a cloning vector available from New England
Biolabs). The 4.6 kb SphI fragment, which encodes the ACP domain of module 7
followed by module 8 through the KR domain, was isolated from an agarose gel after
digesting the cosmid pKOS65-C31 with SphI. The clone having the insert oriented so
the single SacI site was nearest to the SpeI end of the polylinker was identified and

designated as plasmid pKOS60-21-67. To generate appropriate cloning sites, two linkers
were ligated sequentially as follows. First, a linker was ligated between the SpeI and
SacI sites to introduce a BglII site at the 5' end of the cassette, to eliminate interfering
polylinker sites, and to reduce the total insert size to 4.5 kb (the limit of the phage
KC515). The ligation reactions contained 5 picomolar unphosphorylated linker DNA and
0.1 picomolar vector DNA, i.e., a 50-fold molar excess of linker to vector. The linker had
the following sequence:

5'-CTAGTGGGCAGATCTGGCAGCT-3'
    3'-ACCCGTCTAGACCG-5'

The resulting plasmid was designated pKOS60-27-1.

Next, a linker of the following sequence was ligated between the unique SphI and
AflII sites of plasmid pKOS60-27-1 to introduce an NsiI site at the 3' end of the module 8
cassette. The linker employed was:

    5'-GGGATGCATGGC-3'
3'-GTACCCCTACGTACCGAATT-5'

The resulting plasmid was designated pKOS60-29-55.
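
The two linkers can be checked for base pairing and for the restriction sites they are said to introduce. The following is a minimal sketch, assuming each linker is written as in the text (one strand 5' to 3', the partner strand 3' to 5') and using the standard recognition sequences for BglII (AGATCT) and NsiI (ATGCAT); the helper names are illustrative.

```python
PAIR = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, without reversing the strand."""
    return seq.translate(PAIR)


def anneal(long_strand: str, short_strand: str):
    """Locate the short strand on the antiparallel long strand.

    One strand is written 5'->3' and the other 3'->5', so paired bases
    line up position by position. Returns the long strand split into
    (left overhang, paired region, right overhang), or None if the
    strands do not pair.
    """
    core = complement(short_strand)
    start = long_strand.find(core)
    if start < 0:
        return None
    end = start + len(core)
    return long_strand[:start], long_strand[start:end], long_strand[end:]


# Linker 1: top strand 5'->3', bottom strand 3'->5'.
linker1 = anneal("CTAGTGGGCAGATCTGGCAGCT", "ACCCGTCTAGACCG")
# Linker 2: the longer strand is the bottom one, written 3'->5'.
linker2 = anneal("GTACCCCTACGTACCGAATT", "GGGATGCATGGC")

print(linker1)  # ('CTAG', 'TGGGCAGATCTGGC', 'AGCT')
print(linker2)  # ('GTAC', 'CCCTACGTACCG', 'AATT')
print("BglII site in linker 1:", "AGATCT" in "CTAGTGGGCAGATCTGGCAGCT")
print("NsiI site in linker 2:", "ATGCAT" in "GGGATGCATGGC")
```

The single-stranded ends of linker 1 (CTAG and AGCT) match the overhangs left by SpeI and SacI. Those of linker 2, read 5' to 3', are CATG and TTAA, matching SphI and AflII.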

To allow in-frame insertions of alternative AT domains, sites were engineered at
the 5' end (AvrII or NheI) and 3' end (XhoI) of the AT domain using the polymerase
chain reaction (PCR) as follows. Plasmid pKOS60-29-55 was used as a template for the
PCR and sequence 5' to the AT domain was amplified with the primers SpeBgl-fwd and
either Avr-rev or Nhe-rev:

SpeBgl-fwd 5'-CGACTCACTAGTGGGCAGATCTGG-3'
Avr-rev 5'-CACGCCTAGGCCGGTCGGTCTCGGGCCAC-3'
Nhe-rev 5'-GCGGCTAGCTGCTCGCCCATCGCGGGATGC-3'

The PCR included, in a 50 µl reaction, 5 µl of 10x Pfu polymerase buffer
(Stratagene), 5 µl of 10x z-dNTP mixture (2 mM dATP, 2 mM dCTP, 2 mM dTTP, 1 mM
dGTP, 1 mM 7-deaza-GTP), 5 µl DMSO, 2 µl of each primer (10 µM), 1 µl of template
DNA (0.1 µg/µl), and 1 µl of cloned Pfu polymerase (Stratagene). The PCR conditions
were 95°C for 2 min., 25 cycles at 95°C for 30 sec, 60°C for 30 sec, and 72°C for 4
min., followed by 4 min. at 72°C and a hold at 0°C. The amplified DNA products and
the Litmus vectors were cut with the appropriate restriction enzymes (BglII and AvrII or
SpeI and NheI), and cloned into either pLitmus 28 or pLitmus 38 (New England Biolabs),
respectively, to generate the constructs designated pKOS60-37-4 and pKOS60-37-2,
respectively.
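
The final concentrations in the 50 µl reaction follow from the stock concentrations and volumes listed above. The following is a minimal sketch of that dilution arithmetic. It assumes the DMSO volume is 5 µl (the unit is garbled in the source) and that the reaction is brought to volume with water, which the text does not state; all variable names are illustrative.

```python
REACTION_UL = 50.0


def final_conc(stock_conc: float, volume_ul: float, total_ul: float = REACTION_UL) -> float:
    """Dilution: final concentration = stock concentration * added volume / total volume."""
    return stock_conc * volume_ul / total_ul


# 10x z-dNTP mixture, stock concentrations in mM; 5 ul is added per reaction.
dntp_stock_mM = {"dATP": 2.0, "dCTP": 2.0, "dTTP": 2.0, "dGTP": 1.0, "7-deaza-GTP": 1.0}
dntp_final_mM = {name: final_conc(conc, 5.0) for name, conc in dntp_stock_mM.items()}

primer_final_uM = final_conc(10.0, 2.0)    # 2 ul of a 10 uM stock, per primer
template_ng = 0.1 * 1.0 * 1000.0           # 1 ul at 0.1 ug/ul, expressed in ng
dmso_percent = 100.0 * 5.0 / REACTION_UL   # assumes 5 ul DMSO (v/v)

# Buffer, dNTP mix, DMSO, two primers, template, polymerase.
listed_ul = 5.0 + 5.0 + 5.0 + 2.0 + 2.0 + 1.0 + 1.0
water_ul = REACTION_UL - listed_ul         # assumption: remainder is water

for name, conc in dntp_final_mM.items():
    print(f"{name}: {conc:.2f} mM")
print(f"each primer: {primer_final_uM:.2f} uM")
print(f"template: {template_ng:.0f} ng")
print(f"DMSO: {dmso_percent:.0f}% (v/v)")
print(f"water to volume: {water_ul:.0f} ul")
```

Under these assumptions each of dATP, dCTP, and dTTP is present at 0.2 mM, dGTP and 7-deaza-GTP at 0.1 mM each, and each primer at 0.4 µM.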

Plasmid pKOS60-29-55 was again used as a template for PCR to amplify
sequence 3' to the AT domain using the primers BsrXho-fwd and NsiAfl-rev:

BsrXho-fwd 5'-GATGTACAGCTCGAGTCGGCACGCCCGGCCGCATC-3'
NsiAfl-rev 5'-CGACTCACTTAAGCCATGCATCC-3'

PCR conditions were as described above. The PCR fragment was cut with BsrGI
and AflII, gel isolated, and ligated into pKOS60-37-4 cut with Asp718 and AflII and
inserted into pKOS60-37-2 cut with BsrGI and AflII, to give the plasmids pKOS60-39-1
and pKOS60-39-13, respectively. These two plasmids can be digested with AvrII and
XhoI or NheI and XhoI, respectively, to insert heterologous AT domains specific for
malonyl, methylmalonyl, ethylmalonyl, or other extender units.

Malonyl and methylmalonyl-specific AT domains were cloned from the
rapamycin cluster using PCR amplification with a pair of primers that introduce an AvrII
or NheI site at the 5' end and an XhoI site at the 3' end. The PCR conditions were as
given above and the primer sequences were as follows:

RATN1 5'-ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG-3'
(3' end of Rap KS sequence and universal for malonyl and methylmalonyl CoA),
RATMN2 5'-ATGCTAGCCGCCGCGTTCCCCGTCTTCGCGCG-3'
(Rap AT shorter version 5'- sequence and specific for malonyl CoA),
RATMMN2 5'-ATGCTAGCGGATTCGTCGGTGGTGTTCGCCGA-3'
(Rap AT shorter version 5'- sequence and specific for methylmalonyl CoA), and
RATC 5'-ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG-3'
(Rap DH 5'- sequence and universal for malonyl and methylmalonyl CoA).

[Figure: primer positions on any rapamycin module, with primer MMN2 introducing an NheI site and primer C introducing an XhoI site.]

Because of the high sequence similarity in each module of the rapamycin cluster,
each primer was expected to prime any of the AT domains. PCR products representing
ATs specific for malonyl or methylmalonyl extenders were identified by sequencing
individual cloned PCR products. Sequencing also confirmed that the chosen clones
contained no cloning artifacts. Examples of hybrid modules with the rapamycin AT12
and AT13 domains are shown in a separate figure.
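
The universal primers contain degenerate positions (R, Y, and S), which is part of why a single primer can anneal to more than one AT domain. The following is a minimal sketch that expands the degenerate primers into the concrete sequences they represent, using the standard IUPAC nucleotide codes; the function name is illustrative, and the primer strings are those given above with extraction spaces removed.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


RATN1 = "ATCCTAGGCGGGCRGGYGTGTCGTCCTTCGG"
RATC = "ATCTCGAGCCAGTASCGCTGGTGYTGGAAGG"

for name, primer in (("RATN1", RATN1), ("RATC", RATC)):
    variants = expand(primer)
    print(name, len(variants), "variants")
    for v in variants:
        print("  ", v)
```

Each universal primer has two two-fold degenerate positions and so represents four sequences. The fixed 5' ends carry the engineered sites: CCTAGG (AvrII) in RATN1 and CTCGAG (XhoI) in RATC.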

The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 12 of the 
rapamycin PKS has the DNA sequence and encodes the amino acid sequence shown 
below. The AT of rap module 12 is specific for incorporation of malonyl units. 

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50
IWQLAEALLTLVREST
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100

AAVLGHVGGEDI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 
25 FKDLGI DSLTAVQLRN 

CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGV RLNATAVFD 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPTPHVLAGKLGDELTG 
30 CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 
35 ASPEELWHLVASGT DAI 

CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 4 50 

TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
40 ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATG FDAAFF G I S PRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAMDPQQRVLLETSW 
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 



EAFESAGITPDSTRGSD
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 7 00 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 7 50 
5 TDGFGATGSQTSVLSG 

GGCTGTCGTACTTCT ACGGTCTGGAGGGTCCGGCGGTC ACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
ACSSSLVALHQAGQSLR 
10 CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
15 GRAKAFGAGADGTS FAE 

GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 

GAGVLIVERLS DAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150
ASNGLSAPNGPSQERVI
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200
RQALANAGLTPADVDA
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250
VEAHGTGTRLGDPIEAQ
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300
AVLATYGQERATPLLLG
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350
SLKSNIGHAQAASGVA
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 1400
GIIKMVQALRHGELPPT
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1450
LHADEPSPHVDWTAGAV
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500
ELLTSARPWPETDRPR

GGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACCAACGCCCACGTCATC 1550 
RAGVSSFGISGTNAHVI 
CTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAACGCGGTGATCGAGCG 1600 
LESAPPTQPADNAVIER 
40 GGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCAGGACCCAGTCGGCTT 1650 
APEWVPLVISARTQSA 
TGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTGGCGGCGTCGCCCGGG 1700 
LTEHEGRLRAYLAAS PG 
GTGGATATGCGGGCTGTGGCATCGACGCTGGCGATGACACGGTCGGTGTT 1750 
45 VDMRAVASTLAMTRSVF 

CGAGCACCGTGCCGTGCTGCTGGGAGATGACACCGTCACCGGCACCGCTG 1800 

EHR AVLLGDDTVTGTA 
TGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGACAGGGGTCGCAGCGT 1850 
VSDPRAVFVFPGQGSQR 
50 GCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCCCGTCTTCGCGCGGAT 1900 
AGMGEELAAAFPVFARI 
CCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACG* 1950 

HQQVWDLLDVPDLEVN 
AGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTC 2000
ETGYAQPALFAMQVALF
GGGCTGCTGGAATCGTGGGGTGTACGACCGGACGCGGTGATCGGCCATTC 2050 

GLLESWGVRPDAVIGHS 
GGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGG 2100 
5 VGELAAAYVSGVWSLE 

ATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCC 2150 
DACTLVSARARLMQALP 
GCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGC 2200 
AGGVMVAVPVS EDEARA 
1 0 CGTGCTGGGTGAGGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGG 2250 
VLGEGVEIAAVNGPSS 
TGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTG 2300 
VVLSGDEAAVLQAAEGL 
GGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTAT 2350 
15 GKWTRLATSHAFHSARM 

GGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACC 2400 

EPMLEEFRAVAEGLTY 
GGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAG 2450 
RTPQVSMAVGDQVTTAE
TACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGC 2500
YWVRQVRDTVRFGEQVA
CTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGG 2550
SYEDAVFVELGADRSL
CCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAG 2600
ARLVDGVAMLHGDHEIQ
GCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGA 2650
AAIGALAHLYVNGVTVD
CTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTC 2700
WPALLGDAPATRVLDL
CGACATACGCCTTCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCG 2750
PTYAFQHQRYWLESARP
GCCGCATCCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGC 2800
AASDAGHPVLGSGIALA
CGGGTCGCCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACC 2850
GSPGRVFTGSVPTGAD

GCGCGGTGTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGAC 2900 
RAVFVAELALAAADAVD 
TGCGCCACGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGG 2950 
CATVERLDIASVPGRPG 
40 CCATGGCCGGACGACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACG 3000 
HGRTTVQTWVDEPADD 
GCCGGCGCCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACG 3050 
GRRRFTVHTRTGD APWT 
CTGCACGCCGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGC 3100 
45 LHAEGVLRPHGTALPDA 

GGCCGACGCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGC 3150 

ADAEWPPPGAVPADGL 
CGGGTGTGTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGAC 3200 
PGVWRRGDQVFAEAEVD 
50 GGACCGGACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTC 3250 
GPDGFVVHPDLLDAVFS 
CGCGGTCGGCG ACGGAAGCCGCC AGCCGGCCGGATGGCGCGACCTGACGG 3300 

AVGDGSRQPAGWRDLT 
TGCACGCGTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACC 3350
VHASDATVLRACLTRRT
GACGGAGCCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACT 3400 

DGAMG FAAFDGAGLPVL 
CACCGCGGAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCG 3450 
5 TAEAVTLREVAS PSGS 

AGGAGTCGGACGGCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCG 3500 
EESDGLHRLEWLAVAEA 
GTCTACGACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCA 3550 
VYDGDL PEGHVLITAAH 
1 0 CCCCGACGACCCCGAGGACATACCCACCCGCGCCCACACCCGCGCCACCC 3600 
PDDPEDI PTRAHTRAT 
GCGTCCTGACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTC 3650 
RVLTALQHHLTTTDHTL 
ATCGTCCACACCACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCAC 3700 
15 IVHTTTDPAGATVTGLT 

CCGCACCGCCC AGAACGAACACCCCCACCGCATCCGCCTCATCG AAACCG 3750 

RTAQNEHPHRIRLIET 
ACCACCCCCACACCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCAC 3800 
DHPHTPLPLAQLA TLDH 
CCCCACCTCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCC 3850
PHLRLTHHTLHHPHLTP
CCTCCACACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACG 3900
LHTTTPPTTTPLNPEH
CCATCATCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGC 3950
AIIITGGSGTLAGILAR
CACCTGAACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGA 4000
HLNHPHTYLLSRTPPPD
CGCCACCCCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAAC 4050
ATPGTHLPCDVGDPHQ
TCGCCACCACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCAC 4100
LATTLTHIPQPLTAIFH
ACCGCCGCCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCG 4150
TAATLDDGILHALTPDR
CCTCACCACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACC 4200
LTTVLHPKANAAWHLH

ACCTCACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCC 4 250 
HLTQNQPLTHFVLYSSA 
GCCGCCGTCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGC 4 300 
AAVLGS PGQGNYAAANA 
40 CTTCCTCGACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCA 4 350 
FLDALATHRHTLGQPA 
CCTCCATCGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAA 4 400 
TSIAWGMWHTTSTLTGQ 
CTCGACGACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGAT 4 4 50 
45 LDDADRDRIRRGGFLPI 
CACGGACGACGAGGGCATGGGGATGCAT
TDDEG
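
The amino acid line printed under each DNA line can be checked by translating the DNA in the reading frame of the listing. The following is a minimal sketch, assuming the standard genetic code and a frame that begins at the third nucleotide of the first line (a frame inferred here from the listing, not stated by it); the function and variable names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, offset: int = 0) -> str:
    """Translate DNA starting at `offset`; unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 50 nucleotides of the listing above.
first_line = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"
print(translate(first_line, offset=2))  # IWQLAEALLTLVREST
```

The output matches the amino acid line printed under the first DNA line. Because 50 is not a multiple of 3, codons straddle line breaks, so longer checks are best run on the concatenated sequence rather than line by line.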






The AvrII-XhoI restriction fragment that encodes module 8 of the FK-520 PKS
with the endogenous AT domain replaced by the AT domain of module 13 (specific for
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the
amino acid sequence shown below.

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50
QLAEALLTLVREST
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100
AAVLGHVGGEDI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 

FKDLGI DSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
10 ALTEATGVRLNATAVFD 

TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 

FPTPHVLAGKLGDEL TG 
CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
1 5 ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
DEPLAIVGMACRLPGGV 
GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400 
ASPEELWHLVASGTDAI
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450
TEFPTDRGWDVDAIYD
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500
PDPDAIGKTFVRHGGFL
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550
TGATGFDAAFFGISPRE
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600
ALAMDPQQRVLLETSW
AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650
EAFESAGITPDSTRGSD
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700
TGVFVGAFSYGYGTGAD
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750
TDGFGATGSQTSVLSG
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800
RLSYFYGLEGPAVTVDT
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850
ACSSSLVALHQAGQSLR 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 

SGECSLALVGGVTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
40 SPGGFVEFSRQRGLAPD 

GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGG ACGGC ACGAGCTTCGCCGA 1000 

GRAKAFGAGADGTS FAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVLIVERLS DAERN 
45 GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
G HTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
50 RQALANAGLT PADVDA 

TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ
GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300

AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 
SLKSN IGHAQAASGVA 
5 GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
GI I KMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 14 50 

LHADE PS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCTAGGC 1500 
10 ELLTSARPWPETDRPR 

GGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACCAACGCCCACGTCATC 1550 
RAGVSS FGVSGTNAHVI 
CTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGAGGCGCAGCCTGTTGA 1600 
LESAP PAQPAEEAQPVE' 
15 GACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGGTGATATCGGCCAAGA 1650 
TPVVASDVLPLVISAK 
CCCAGCCCGCCCTGACCGAACACGAAGACCGGCTGCGCGCCTACCTGGCG 1700 
TQPALTEHEDRLRAYLA 
GCGTCGCCCGGGGCGGATATACGGGCTGTGGCATCGACGCTGGCGGTGAC 17 50 
20 ASPGADIRAVASTLAVT 

ACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTGGAGATGACACCGTCA 1800 

RSVFEHRAVLLGDDTV 
CCGGCACCGCGGTGACCGACCCCAGGATCGTGTTTGTCTTTCCCGGGCAG 1850 
TGTAVTDPRIV FVFPGQ 
25 ' GGGTGGCAGTGGCTGGGGATGGGCAGTGCACTGCGCGATTCGTCGGTGGT 1900 
GWQWLGMG SALRD SSVV 
GTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGTTGCGCGAGTTCGTGG 1950 

FAERMAECAAALREFV 
ACTGGGATCTGTTCACGGTTCTGGATGATCCGGCGGTGGTGGACCGGGTT 2000 
30 DWDLFTVLDDPAVVDRV 

GATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGTTTCCCTGGCCGCGGT 2050 

DVVQPASWAMMVSLAAV 
GTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGATCGGCCATTCGCAGG 2100 
WQAAGVRPDAVIGHSQ 
35 GTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTGTCACTACGCGATGCC 2150 
GEIAAACVAGAVSLRDA 
GCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGCCCGGGGCCTGGCGGG 2200 

ARIVTLRSQAIARGLAG 
CCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCGCAGGATGTCGAGCTGG 2250 
40 RGAMASVALPAQDVEL 

TCGACGGGGCCTGGATCGCCGCCCACAACGGGCCCGCCTCCACCGTGATC 2300 
VDGAWIAAHNGPASTVI 
GCGGGCACCCCGGAAGCGGTCGACCATGTCCTCACCGCTCATGAGGCACA 2350 
AGTPEAVDHVLTAHEAQ 
45 AGGGGTGCGGGTGCGGCGGATCACCGTCGACTATGCCTCGCACACCCCGC 2400 
GVRVRRITVDYASHTP 
ACGTCGAGCTGATCCGCGACGAACTACTCGACATCACTAGCGACAGCAGC 2450 
HVELIRDELLDITSDSS 
TCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGTGGACGGCACCTGGGT 2500 
50 SQTPLVPWLSTVDGTWV 

CGACAGCCCGCTGGACGGGGAGTACTGGTACCGGAACCTGCGTGAACCGG 2550 

DSPLDGEYWYRNLREP 
TCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCCCAGGGCGACACCGTG 2600 
VGFHPAVSQLQAQGDTV
TTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCAGGCGATGGACGACGA 2650
FVEVSASPVLLQAMDDD
TGTCGTCACGGTTGCCACGCTGCGTCGTGACGACGGCGACGCCACCCGGA 2700
VVTVATLRRDDGDATR
TGCTCACCGCCCTGGCACAGGCCTATGTCCACGGCGTCACCGTCGACTGG 2750
MLTALAQAYVHGVTVDW
CCCGCCATCCTCGGCACCACCACAACCCGGGTACTGGACCTTCCGACCTA 2800
PAILGTTTTRVLDLPTY
CGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGGCACGCCCGGCCGCAT 2850
AFQHQRYWLESARPAA
CCGACGCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCG 2900
SDAGHPVLGSGIALAGS
CCGGGCCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGT 2950

PGRVFTGS VPTGADRAV 
15 GTTCGTCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCA 3000 
FVAELALAAADAVDCA 
CGGTCGAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGC 3050 
TVERLDIASVPGRPGHG 
CGGACGACCGT ACAGACCTGGGTCGACG AGCCGGCGGACGACGGCCGGCG 3100 
20 RTTVQTWVDEPADDGRR 

CCGGTTCACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACG 3150 

RFTVHTRTGDAPWTLH 
CCGAGGGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGAC 3200 
AEGV L R P HGTAL P D A A D 
25 GCCGAGTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGGTGT 3250 
AEWP PPGAVPA.DGLPGV 
GTGGCGCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGG 3300 

WRRGDQVFAEAEVDGP 
ACGGTTTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTC 3350 
30 DGFVVHPDLLDAVFSAV 

GGCGACGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGC 34 00 

GDGSRQPAGWRDLTVHA 
GTCGGACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAG 34 50 
S DAT V L RAC L T R RT DG 
35 CCATGGGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCG 3500 
AMGFAAFDGA .GLPVLTA 
GAGGCGGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTC 3550 

EAVTLREVASPSGSEES 
GGACGGCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACG 3600 
40 DGLHRLEWLAVAEAVY 

ACGGTGACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGAC 3650 
DGDLPEGHVLITAAHPD 
G ACCCCGAGGACAT ACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCT 3700 
DPEDI PTRAHTRATRVL 
45 GACCGCCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCC 3750 
TALQHHLTTTDHTLIV 
ACACCACCACCG ACCCCGCCGGCGCCACCGTCACCGGCCTC ACCCGC ACC 3800 
HTTTDPAGATVTGLTRT 
GCCCAGAACGAACACCCCC ACCGC ATCCGCCTCATCGAAACCGACCACCC 3850 
50 AQNEHPHRIRLIETDHP 

CCACACCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCACCCCCACC 3900 

HTPLPLAQLATLDHPH 
TCCGCCTCACCCACCACACCCTCCACCACCCCCACCTCACCCCCCTCCAC 3950 
LRLTHHTLHHPHLTPLH
ACCACCACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCAT 4000
TTTPPTTTPLNPEHAII
CATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGA 4 050 
ITGGSGTLAG I LARHL 
5 ACCACCCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACC 4100 
NHPHTYLLSRTPPPDAT 
CCCGGCACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCAC 4150 

PGTHLPCDVGDPHQLAT 
CACCCTCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCG 4200 
10 TLTHIPQPLTAI FHTA 

CCACCCTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACC 4 250 
ATLD DGI LHALTPDRLT 
ACCGTCCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCAC 4300 
TV LH PKANAAWHLHHLT 
1 5 CCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCG 4 350 
QNQPLTHFVLYSSAAA 
TCCTCGGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTC 4 400 
VLGS PGQGNYAAANAFL 
GACGCCCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCAT 4 450 
20 DALATHRHTLGQPATS I 

CGCCTGGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACG 4 500 

AWGMWHTTSTLTGQLD 
ACGCCGACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGAC 4 550 
DADRDRI RRGGFLP ITD 
GACGAGGGCATGGGGATGCAT
DEG



The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS 
with the endogenous AT domain replaced by the AT domain of module 12 (specific for 
malonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the amino acid 
sequence shown below. 

AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 50 

QLAEALLTLVREST 
GCCGCCGTGCTCGGCC ACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 
35 AAVLGHVGGEDI PATAA 

GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 

FKDLGI DSLTAVQLRN 
CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRL NATAVFD 
40 TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPT PHVLAGKLGDELTG 
C ACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 

TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 
45 DEPLAIVGMACRLPGGV 

GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 4 00 

AS PEEL WHLVASGTDAI 
CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 
TEFPTDRGWDVDAIYD 
50 CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 

TGATGFDAAFFGI S PRE 
GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 
5 ALAMD PQQRVLLETSW 

AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 
EAFESAGITPDSTRGSD 
ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 
TGVFVGAFSYGYGTGAD 
10 CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750 
TDGFGATGSQTSVLSG 
GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 
15 ACSSSLVALHQAGQSLR 

CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 

SGECSLALVGG VTVMA 
CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 

GRAKAFGAGADGTSFAE 
GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 

GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 

GHTVLAVVRGSAVNQDG 

GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 

ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 
RQALANAGLT PADVDA 
TCGAGGCCCACGGCACCGGCACCAGGCTGGGCGACCCCATCGAGGCACAG 1250 
VEAHGTGTRLGDPIEAQ 

GCGGTACTGGCCACCTACGGACAGGAGCGCGCCACCCCCCTGCTGCTGGG 1300 

AVLATYGQERATPLLLG 
CTCGCTGAAGTCCAACATCGGCCACGGCCAGGCCGCGTCCGGCGTCGCCG 1350 

SLKSNIGHAQAASGVA 

GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 1400 
GIIKMVQALRHGELPPT 
CTGCACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1450 
LHADEPS PHVDWTAGAV 
40 CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
ELLTSARPWPETD RPR 
GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 1550 
RAAVSS FGVSGTNAHVI 
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600 
45 LEAGPVTETPAAS PSGD 

CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 

LPLLVSARS PEALDEQ 
TCCGCCG ACTGCGCGCCT ACCTGG ACACCACCCCGG ACGTCGACCGGGTG 1700 
IRRLRAYLDTTPDVDRV 
50 GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 1750 
AVAQTLARRTHFAHRAV 
GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800 

LLGDTVITTPPADRPD 
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850 
ELVFVYSGQGTQHPAMG 
GAGCAGCTAGCCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGT 1900 

EQLAAAFPVFARI HQQV 
GTGGGACCTGCTCGATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACG 1950 
5 WDLLDVPDLEVNETGY 

CCCAGCCGGCCCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAA 2000 
AQPAL FAMQVALFGLLE 
TCGTGGGGTGTACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCT 2050 
SWGVRPDAVIGHSVGEL 
10 TGCGGCTGCGTATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTT 2100 
AAAYVSGVWSLEDACT 
TGGTGTCGGCGCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTG 2150 
LVSARARLMQALPAGGV 
ATGGTCGCTGTCCCGGTCTCGGAGGATGAGGCCCGGGCCGTGCTGGGTGA 2200 
MVAVPVSEDEARAVLGE 

GGGTGTGGAGATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCG 2250 

GVEIAAVNGPSSVVLS 
GTGATGAGGCCGCCGTGCTGCAGGCCGCGGAGGGGCTGGGGAAGTGGACG 2300 
GDEAAVLQAAEGLGKWT 
CGGCTGGCGACCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCT 2350 

RLATSHAFHSARMEPML 

GGAGGAGTTCCGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGG 2400 
EEFRAVAEGLTYRTPQ 
TCTCCATGGCCGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGG 2450 

VSMAVGDQVTTAEYWVR 

CAGGTCCGGGACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGA 2500 

QVRDTVRFGEQVASYED 

CGCCGTGTTCGTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCG 2550 
AVFVELGADRSLARLV 
ACGGTGTCGCGATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGC 2600 

DGVAMLHGDHEIQAAIG 
GCCCTGGCCCACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCT 2650 

ALAHLYVNGVTVDWPAL 
CCTGGGCGATGCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCT 2700 
35 LGDAPATRVLDLPTYA 

TCCAGCACCAGCGCTACTGGCTCGAGTCGGCACGCCCGGCCGCATCCGAC 2750 
FQHQRYWLESARPAASD 
GCGGGCCACCCCGTGCTGGGCTCCGGTATCGCCCTCGCCGGGTCGCCGGG 2800 
AGHPVLGSGIALAGSPG 
40 CCGGGTGTTCACGGGTTCCGTGCCGACCGGTGCGGACCGCGCGGTGTTCG 2850 
RVFTGSVPTGADRAVF 
TCGCCGAGCTGGCGCTGGCCGCCGCGGACGCGGTCGACTGCGCCACGGTC 2900 
VA E L A LA A A DAV D C AT V 
GAGCGGCTCGACATCGCCTCCGTGCCCGGCCGGCCGGGCCATGGCCGGAC 2950 
45 ERLDIASVPGRPGHGRT 

GACCGTACAGACCTGGGTCGACGAGCCGGCGGACGACGGCCGGCGCCGGT 3000 

TVQTWVDEP ADDGRRR 
TC ACCGTGCACACCCGCACCGGCGACGCCCCGTGGACGCTGCACGCCGAG 3050 
FTVHTRTGDAPWTLHAE 
50 GGGGTGCTGCGCCCCCATGGCACGGCCCTGCCCGATGCGGCCGACGCCGA 3100 
GVLRPHGTALPDAADAE 
GTGGCCCCCACCGGGCGCGGTGCCCGCGGACGGGCTGCCGGGTGTGTGGC 3150 

WPPPGAVPADGLPGVW 
GCCGGGGGGACCAGGTCTTCGCCGAGGCCGAGGTGGACGGACCGGACGGT 3200 
RRGDQVFAEAEVDGPDG 
TTCGTGGTGCACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGA 3250 

FVVHPDLLDAVFSAVGD 
CGGAAGCCGCCAGCCGGCCGGATGGCGCGACCTGACGGTGCACGCGTCGG 3300 
5 GSRQPAGWRDLTVHAS 

ACGCCACCGTACTGCGCGCCTGCCTCACCCGGCGCACCGACGGAGCCATG 3350 
DATVLRACLTRRT DG.AM 
GGATTCGCCGCCTTCGACGGCGCCGGCCTGCCGGTACTCACCGCGGAGGC 34 00 
GFAAFDGAGLPVLTAEA 
1 0 GGTGACGCTGCGGGAGGTGGCGTCACCGTCCGGCTCCGAGGAGTCGGACG 34 50 
VTLREVASPSGSEESD 
GCCTGCACCGGTTGGAGTGGCTCGCGGTCGCCGAGGCGGTCTACGACGGT 3500 
GLHRLEWLAVAE AVYDG 
GACCTGCCCGAGGGACATGTCCTGATCACCGCCGCCCACCCCGACGACCC 3550 
15 DLPEGHVLITAAHPDDP 

CGAGGACATACCCACCCGCGCCCACACCCGCGCCACCCGCGTCCTGACCG 3600 

EDI PTRAHTRAT RVLT 
CCCTGCAACACCACCTCACCACCACCGACCACACCCTCATCGTCCACACC 3650 
ALQHHLTTTDHTLIVHT 
ACCACCGACCCCGCCGGCGCCACCGTCACCGGCCTCACCCGCACCGCCCA 3700 

TTDPAGATVTGLTRTAQ 
GAACGAACACCCCCACCGCATCCGCCTCATCGAAACCGACCACCCCCACA 3750 

NEHPHRIRLIETDHPH 
CCCCCCTCCCCCTGGCCCAACTCGCCACCCTCGACCACCCCCACCTCCGC 3800 

TPLPLAQLATLDHPHLR 

CTCACCCACCACACCCTCCACCACCCCCACCTCACCCCCCTCCACACCAC 3850 

LTHHTLHHPHLTPLHTT 

CACCCCACCCACCACCACCCCCCTCAACCCCGAACACGCCATCATCATCA 3900 

TPPTTTPLNPEHAIII 
CCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCACCTGAACCAC 3950 

TGGSGTLAGILARHLNH 

CCCCACACCTACCTCCTCTCCCGCACCCCACCCCCCGACGCCACCCCCGG 4000 
PHTYLLSRTPPPDATPG 

CACCCACCTCCCCTGCGACGTCGGCGACCCCCACCAACTCGCCACCACCC 4050 
THLPCDVGDPHQLATT 

TCACCCACATCCCCCAACCCCTCACCGCCATCTTCCACACCGCCGCCACC 4100 

LTHIPQPLTAI FHTAAT 
CTCGACGACGGCATCCTCCACGCCCTCACCCCCGACCGCCTCACCACCGT 4150 
LDDGILHALTPDRLTTV 
40 CCTCCACCCCAAAGCCAACGCCGCCTGGCACCTGCACCACCTCACCCAAA 4200 
LHPKANAAWHLHHLTQ 
ACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCCGCCGTCCTC 4250 
NQPLTHFVLYS SAAAVL 
GGCAGCCCCGGACAAGGAAACTACGCCGCCGCCAACGCCTTCCTCGACGC 4300 
45 GSPGQ GNYAAANAFL DA 

CCTCGCCACCCACCGCCACACCCTCGGCCAACCCGCCACCTCCATCGCCT 4 350 

LATHRHTLGQPATSIA 
GGGGCATGTGGCACACCACCAGCACCCTCACCGGACAACTCGACGACGCC 4 4 00 
WGMWHTTSTLTGQLDDA 
50 GACCGGGACCGCATCCGCCGCGGCGGTTTCCTCCCGATCACGGACGACGA 4 4 50 
DRDRIRRGGFLPITDDE 
GGGCATGGGGATGCAT 
G 
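The listings above pair each line of DNA sequence with its one-letter amino acid translation. As an illustration only, and not as part of the disclosed method, the following Python sketch shows how such a translation can be reproduced with the standard genetic code. The function name `translate` and its `offset` parameter are assumptions introduced here; an offset of 8 nucleotides is the reading frame that reproduces the first translated line of the listing.

```python
# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO))


def translate(dna, offset=0):
    """Translate dna starting at offset; a trailing partial codon is ignored.

    Whitespace is removed first, so lines copied from a listing with stray
    spaces can be passed in directly.
    """
    dna = "".join(dna.split()).upper()
    return "".join(CODON_TABLE[dna[i:i + 3]]
                   for i in range(offset, len(dna) - 2, 3))


# First 50 nucleotides of the module 8 cassette as printed in the listing.
FIRST_LINE = "AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC"
print(translate(FIRST_LINE, offset=8))
```

Comparing the printed translation with the output of such a routine is one way to spot transcription errors in a listing, since a miscopied nucleotide usually changes the residue at that position.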
The NheI-XhoI restriction fragment that encodes module 8 of the FK-520 PKS 
with the endogenous AT domain replaced by the AT domain of module 13 (specific for 
methylmalonyl CoA) of the rapamycin PKS has the DNA sequence and encodes the 
amino acid sequence shown below. 



5 AGATCTGGCAGCTCGCCGAAGCGCTGCTGACGCTCGTCCGGGAGAGCACC 5 0 
QLAEALLTLVREST 
GCCGCCGTGCTCGGCCACGTGGGTGGCGAGGACATCCCCGCGACGGCGGC 100 

AAVLGHVGGEDI PATAA 
GTTCAAGGACCTCGGCATCGACTCGCTCACCGCGGTCCAGCTGCGCAACG 150 
10 FKDLG I DSLTAVQLRN 

CCCTCACCGAGGCGACCGGTGTGCGGCTGAACGCCACGGCGGTCTTCGAC 200 
ALTEATGVRLNATAVF D 
TTCCCGACCCCGCACGTGCTCGCCGGGAAGCTCGGCGACGAACTGACCGG 250 
FPT PHVLAGKLGDELTG 
15 CACCCGCGCGCCCGTCGTGCCCCGGACCGCGGCCACGGCCGGTGCGCACG 300 
TRAPVVPRTAATAGAH 
ACGAGCCGCTGGCGATCGTGGGAATGGCCTGCCGGCTGCCCGGCGGGGTC 350 

DEPLAIVGMACRLPGGV 

GCGTCACCCGAGGAGCTGTGGCACCTCGTGGCATCCGGCACCGACGCCAT 400 

ASPEELWHLVASGTDAI 

CACGGAGTTCCCGACGGACCGCGGCTGGGACGTCGACGCGATCTACGACC 450 

TEFPTDRGWDVDAIYD 
CGGACCCCGACGCGATCGGCAAGACCTTCGTCCGGCACGGTGGCTTCCTC 500 
PDPDAIGKTFVRHGGFL 
ACCGGCGCGACAGGCTTCGACGCGGCGTTCTTCGGCATCAGCCCGCGCGA 550 
TGATGFDAAFFGISPRE 

GGCCCTCGCGATGGACCCGCAGCAGCGGGTGCTCCTGGAGACGTCGTGGG 600 

ALAMDPQQRVLLETSW 

AGGCGTTCGAAAGCGCCGGCATCACCCCGGACTCGACCCGCGGCAGCGAC 650 

EAFESAGITPDSTRGSD 

ACCGGCGTGTTCGTCGGCGCCTTCTCCTACGGTTACGGCACCGGTGCGGA 700 

TGVFVGAFSYGYGTGAD 
CACCGACGGCTTCGGCGCGACCGGCTCGCAGACCAGTGTGCTCTCCGGCC 750 
TDGFGATGSQTSVLSG 
35 GGCTGTCGTACTTCTACGGTCTGGAGGGTCCGGCGGTCACGGTCGACACG 800 
RLSYFYGLEGPAVTVDT 
GCGTGTTCGTCGTCGCTGGTGGCGCTGCACCAGGCCGGGCAGTCGCTGCG 850 

ACSSS LVALHQAGQS L R 
CTCCGGCGAATGCTCGCTCGCCCTGGTCGGCGGCGTCACGGTGATGGCGT 900 
40 SGEC S LALVGGVTVMA 

CTCCCGGCGGCTTCGTGGAGTTCTCCCGGCAGCGCGGCCTCGCGCCGGAC 950 
SPGGFVEFSRQRGLAPD 
GGCCGGGCGAAGGCGTTCGGCGCGGGTGCGGACGGCACGAGCTTCGCCGA 1000 
GRAKAFGAGADGTS FAE 
45 GGGTGCCGGTGTGCTGATCGTCGAGAGGCTCTCCGACGCCGAACGCAACG 1050 
GAGVLIVERLSDAERN 
GTCACACCGTCCTGGCGGTCGTCCGTGGTTCGGCGGTCAACCAGGATGGT 1100 
GHTVLAVVRGSAVNQDG 
GCCTCCAACGGGCTGTCGGCGCCGAACGGGCCGTCGCAGGAGCGGGTGAT 1150 
50 ASNGLSAPNGPSQERVI 
CCGGCAGGCCCTGGCCAACGCCGGGCTCACCCCGGCGGACGTGGACGCCG 1200 

RQALANAGLTPADVDA 
TCG AGGCCCACGGCACCGGCACCAGGCTGGGCGACCCC ATCGAGGC AC AG 1250 
VEAHGTGTRLGDPI EAQ 
5 GCGGTACTGGCCACCTACGGACAGGAGCGCGCC ACCCCCCTGCTGCTGGG 1300 
AVLAT YGQERAT PLLLG 
CTCGCTGAAGTCCAACATCGGCCACGCCCAGGCCGCGTCCGGCGTCGCCG 1350 

SLKSNIGHAQAASGVA 
GCATCATCAAGATGGTGCAGGCCCTCCGGCACGGGGAGCTGCCGCCGACG 14 00 
10 G I I KMVQ-ALRHGELPPT 

CTGC ACGCCGACGAGCCGTCGCCGCACGTCGACTGGACGGCCGGCGCCGT 1450 

LHADE PS PHVDWTAGAV 
CGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGCCACGGC 1500 
ELLTSARPWPETDRPR 
1 5 GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATC 1550 
RAAVS SFGVSGTNAHVI 
CTGGAGGCCGGACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGA 1600 

LEAGPVTET PAAS PSGD 
CCTTCCCCTGCTGGTGTCGGCACGCTCACCGGAAGCGCTCGACGAGCAGA 1650 
20 LPLLVSARSPEALDEQ 

TCCGCCGACTGCGCGCCTACCTGGACACCACCCCGGACGTCGACCGGGTG 1700 
I RRLRAYLDTTP DVDRV 
GCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCCACCGCGCCGT 1750 
AVAQTLARRTH FAHRAV 
25 GCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG 1800 
LLGDTVITTPPADRPD 

AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGC 1850 

ELVFVYSGQGTQHPAMG 
GAGCAGCTAGCCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTG 1900 
EQLADSSVVFAERMAEC 

TGCGGCGGCGTTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGG 1950 

AAALREFVDWDLFTVL 
ATGATCCGGCGGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGG 2000 
DDPAVVDRVDVVQPASW 
GCGATGATGGTTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCC 2050 
AMMVSLAAVWQAAGVRP 

GGATGCGGTGATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGG 2100 

DAVI GHSQGEIAAACV 
CGGGTGCGGTGTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGC 2150 
40 AGAVSLRDAARI VTLRS 

CAGGCGATCGCCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGC 2200 

QAIARGLAGRGAMASVA 
CCTGCCCGCGCAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCC 2250 
LPAQDVELVDGAWIAA 
45 ACAACGGGCCCGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGAC 2300 
HNGPASTVIAGT PEAVD 
CATGTCCTCACCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCAC 2350 

HVLTAHEAQGVRVRRIT 
CGTCGACTATGCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAAC 24 00 
50 VDYASHTPHVELIRDE 

TACTCGACATCACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGG 2450 
LLDITSDSSSQTPLVPW 
CTGTCGACCGTGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTA 2500 
LSTVDGTWVDSPLDGEY 
CTGGTACCGGAACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCC 2550 

WYRNLREPVGFHPAVS 
AGTTGCAGGCCCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCG 2 600 
QLQAQGDTVFVEVSASP 
5 GTGTTGTTGCAGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCG 2650 
VLLQAMDDDVVTVATLR 
TCGTGACGACGGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCT 2700 

RDDGDATRMLTALAQA 
ATGTCCACGGCGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACA 2750 
10 YVHGVTVDWPAI LGTTT 

ACCCGGGTACTGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTG 2800 

TRVLDLPTYAFQHQRYW 
GCTCGAGTCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGG 2850 
LESARPAASDAGHPVL 
15 GCTCCGGTATCGCCCTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTTCC 2900 
GSGIALAGSPGRVFTGS 
GTGCCGACCGGTGCGGACCGCGCGGTGTTCGTCGCCGAGCTGGCGCTGGC 2950 

VPTGADRAVFVAELALA 
CGCCGCGGACGCGGTCGACTGCGCCACGGTCGAGCGGCTCGACATCGCCT 3000 
20 AADAVDCATVERLDIA 

CCGTGCCCGGCCGGCCGGGCCATGGCCGGACGACCGTACAGACCTGGGTC 3050 
SVPGRPGHGRTTVQTWV 
GACGAGCCGGCGGACGACGGCCGGCGCCGGTTCACCGTGCACACCCGCAC 3100 
DEPADDGRRRFTVHTRT 
CGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTGCTGCGCCCCCATG 3150 

GDAPWTLHAEGVLRPH 
GCACGGCCCTGCCCGATGCGGCCGACGCCGAGTGGCCCCCACCGGGCGCG 3200 
GTALPDAADAEWPPPGA 
GTGCCCGCGGACGGGCTGCCGGGTGTGTGGCGCCGGGGGGACCAGGTCTT 3250 
VPADGLPGVWRRGDQVF 

CGCCGAGGCCGAGGTGGACGGACCGGACGGTTTCGTGGTGCACCCCGACC 3300 

AEAEVDGPDGFVVHPD 
TGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGAAGCCGCCAGCCGGCC 3350 
LLDAVFSAVGDGSRQPA 
GGATGGCGCGACCTGACGGTGCACGCGTCGGACGCCACCGTACTGCGCGC 3400 

GWRDLTVHASDATVLRA 

CTGCCTCACCCGGCGCACCGACGGAGCCATGGGATTCGCCGCCTTCGACG 3450 

CLTRRTDGAMGFAAFD 
GCGCCGGCCTGCCGGTACTCACCGCGGAGGCGGTGACGCTGCGGGAGGTG 3500 
40 GAGLPVLTAEAVTLREV 

GCGTCACCGTCCGGCTCCGAGGAGTCGGACGGCCTGCACCGGTTGGAGTG 3550 

ASPSGSEESDGLHRLEW 
GCTCGCGGTCGCCGAGGCGGTCTACGACGGTGACCTGCCCGAGGGACATG 3600 
LAVAEAVYDGDLPEGH 
45 TCCTGATCACCGCCGCCCACCCCGACGACCCCGAGGACATACCCACCCGC 3650 
VLITAAHPDDPEDIPTR 
GCCCACACCCGCGCCACCCGCGTCCTGACCGCCCTGCAACACCACCTCAC 3700 

AHTRATRVLTALQHHLT 
CACCACCGACCACACCCTCATCGTCCACACCACCACCGACCCCGCCGGCG 3750 
TTDHTLIVHTTTDPAG 

CCACCGTCACCGGCCTCACCCGCACCGCCCAGAACGAACACCCCCACCGC 3800 
ATVTGLTRTAQNEHPHR 
ATCCGCCTCATCGAAACCGACCACCCCCACACCCCCCTCCCCCTGGCCCA 3850 
IRLIETDHPHTPLPLAQ 
ACTCGCCACCCTCGACCACCCCCACCTCCGCCTCACCCACCACACCCTCC 3900 

LATLDHPHLRLTHHTL 
ACCACCCCCACCTCACCCCCCTCCACACCACCACCCCACCCACCACCACC 3950 
HH PHLTPLHTTTPPTTT 
5 CCCCTCAACCCCGAACACGCCATCATCATCACCGGCGGCTCCGGCACCCT 4 000 
PLNPEHAI IITGGSGTL 
CGCCGGCATCCTCGCCCGCCACCTGAACCACCCCCACACCTACCTCCTCT 4050 

AG I LARH LNH P HT YL L 
CCCGCACCCCACCCCCCGACGCCACCCCCGGCACCCACCTCCCCTGCGAC 4100 
10 SRTPPPDATPGTHLPCD 

GTCGGCGACCCCCACCAACTCGCCACCACCCTCACCCACATCCCCCAACC 4150 

VGDPHQLATTLTH I PQP 
CCTCACCGCCATCTTCCACACCGCCGCCACCCTCGACGACGGCATCCTCC 4200 
LTAIFHTAATLDDGIL 
1 5 ACGCCCTCACCCCCGACCGCCTCACCACCGTCCTCCACCCCAAAGCCAAC 4250 
HALTPDRLTTVLH PKAN 
GCCGCCTGGCACCTGCACCACCTCACCCAAAACCAACCCCTCACCCACTT 4 300 

AAWHLHHLTQNQPLTHF 
CGTCCTCTACTCCAGCGCCGCCGCCGTCCTCGGCAGCCCCGGACAAGGAA 4 350 
20 VLYS SAAAVLGS PGQG 

ACTACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCAC 4 4 00 
NYAAANAFLDALATHRH 
ACCCTCGGCCAACCCGCCACCTCCATCGCCTGGGGCATGTGGCACACCAC 4 4 50 
TLGQPATSIAWGMWHTT 
CAGCACCCTCACCGGACAACTCGACGACGCCGACCGGGACCGCATCCGCC 4500 

STLTGQLDDADRDRIR 

GCGGCGGTTTCCTCCCGATCACGGACGACGAGGGCATGGGGATGCAT 

RGGFLPITDDEG 



Phage KC515 DNA was prepared using the procedure described in Genetic 
Manipulation of Streptomyces, A Laboratory Manual, edited by D. Hopwood et al. A 
phage suspension prepared from 10 plates (100 mm) of confluent plaques of KC515 on S. 
lividans TK24 generally gave about 3 µg of phage DNA. The DNA was ligated to 
circularize at the cos site, subsequently digested with restriction enzymes BamHI and 
PstI, and dephosphorylated with SAP. 

Each module 8 cassette described above was excised with restriction enzymes 
BglII and NsiI and ligated into the compatible BamHI and PstI sites of KC515 phage 
DNA prepared as described above. The ligation mixture containing KC515 and various 
cassettes was transfected into protoplasts of Streptomyces lividans TK24 using the 
procedure described in Genetic Manipulation of Streptomyces, A Laboratory Manual, 
edited by D. Hopwood et al., and overlaid with TK24 spores. After 16-24 hr, the plaques 
were restreaked on plates overlaid with TK24 spores. Single plaques were picked and 
resuspended in 200 µL of nutrient broth. Phage DNA was prepared by the boiling method 
(Hopwood et al., supra). PCR with primers spanning the left and right boundaries of 
the recombinant phage was used to verify that the correct phage had been isolated. In most 
cases, at least 80% of the plaques contained the expected insert. To confirm the presence 
of the resistance marker (thiostrepton), a spot test is used, as described in Lomovskaya et 
al. (1997), in which a plate with spots of phage is overlaid with a mixture of spores of 
TK24 and phiC31 TK24 lysogen. After overnight incubation, the plate is overlaid with 
antibiotic in soft agar. A working stock is made of all phage containing the desired 
constructs. 
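The cloning step above relies on BglII ends being compatible with BamHI ends, and NsiI ends with PstI ends. The following Python sketch, offered only as an illustration, derives the single-stranded overhang left by each enzyme from its recognition site and cut position and checks that the pairs match. The names `SITES`, `overhang`, and `compatible` are assumptions introduced here; the cassette ends are copied from the listings above.

```python
# Recognition sites (5'->3', top strand) and the top-strand cut position.
# All four sites are palindromic, so the bottom strand is cut symmetrically.
SITES = {
    "BglII": ("AGATCT", 1),  # A^GATCT
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "NsiI": ("ATGCAT", 5),   # ATGCA^T
    "PstI": ("CTGCAG", 5),   # CTGCA^G
}


def overhang(enzyme):
    """Return (kind, sequence) of the single-stranded end left by the enzyme."""
    site, cut = SITES[enzyme]
    lo, hi = sorted((cut, len(site) - cut))
    kind = "5'" if cut < len(site) - cut else "3'"
    return kind, site[lo:hi]


def compatible(a, b):
    """Ends anneal when overhang type and sequence match.

    The overhangs here (GATC, TGCA) are self-complementary, so an identical
    overhang sequence is sufficient for base pairing.
    """
    return overhang(a) == overhang(b)


# First and last nucleotides of the module 8 cassettes as printed above.
CASSETTE_START = "AGATCTGGCAGC"
CASSETTE_END = "GGCATGGGGATGCAT"

print(overhang("BglII"), overhang("BamHI"), compatible("BglII", "BamHI"))
print(overhang("NsiI"), overhang("PstI"), compatible("NsiI", "PstI"))
print(CASSETTE_START.startswith(SITES["BglII"][0]),
      CASSETTE_END.endswith(SITES["NsiI"][0]))
```

Because the two enzymes in each pair recognize different flanking bases, a BglII/BamHI or NsiI/PstI junction formed by ligation is no longer cut by either enzyme of the pair.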

Streptomyces hygroscopicus ATCC 14891 (see U.S. Patent No. 3,244,592, issued 
5 Apr 1966, incorporated herein by reference) mycelia were infected with the 
recombinant phage by mixing the spores and phage (1 x 10^8 of each), and incubating on 
R2YE agar (Genetic Manipulation of Streptomyces, A Laboratory Manual, edited by D. 
Hopwood et al.) at 30°C for 10 days. Recombinant clones were selected and plated on 
minimal medium containing thiostrepton (50 µg/ml) to select for the thiostrepton 
resistance-conferring gene. Primary thiostrepton-resistant clones were isolated and 
purified through a second round of single colony isolation, as necessary. To obtain 
thiostrepton-sensitive revertants that underwent a second recombination event to evict the 
phage genome, primary recombinants were propagated in liquid media for two to three 
days in the absence of thiostrepton and then spread on agar medium without thiostrepton 
to obtain spores. Spores were plated to obtain about 50 colonies per plate, and 
thiostrepton-sensitive colonies were identified by replica plating onto thiostrepton-containing 
agar medium. PCR was used to determine which of the thiostrepton-sensitive 
colonies reverted to the wild type (reversal of the initial integration event), and 
which contained the desired AT swap at module 8 in the ATCC 14891-derived cells. The 
PCR primers used amplified either the KS/AT junction or the AT/DH junction of the 
wild-type and the desired recombinant strains. Fermentation of the recombinant strains, 
followed by isolation of the metabolites and analysis by LC-MS and NMR, is used to 
characterize the novel polyketide compounds. 
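The PCR screen described above distinguishes wild-type revertants from colonies carrying the AT swap by amplifying across a domain junction. The following Python sketch illustrates the logic with an in silico PCR. The primer sequences are not given in the text, so every sequence below is a hypothetical toy sequence, and the design in which one primer anneals in the shared KS region and the other within the AT domain is an assumption made for illustration.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template, fwd, rev):
    """Return the product primed by fwd and rev (both 5'->3'), or None."""
    start = template.find(fwd)
    if start < 0:
        return None
    site = revcomp(rev)  # where the reverse primer anneals, read on the top strand
    end = template.find(site, start + len(fwd))
    if end < 0:
        return None
    return template[start:end + len(site)]


# Hypothetical toy sequences (not taken from the listings).
KS = "ATGGCCTGCCGT"
AT_WILD = "GGTCAGGGCACCCAG"
AT_SWAP = "GGACAGGGGTCGCAG"
DH = "CTCGAGTCGGCA"
WILD_TYPE = KS + AT_WILD + DH
RECOMBINANT = KS + AT_SWAP + DH

FWD_KS = "GCCTGCCGT"                 # anneals in the shared KS region
REV_WILD = revcomp("CAGGGCACCCAG")   # anneals only in the wild-type AT domain
REV_SWAP = revcomp("CAGGGGTCGCAG")   # anneals only in the swapped AT domain


def classify(template):
    """Call a colony from which KS/AT junction product can be formed."""
    if amplicon(template, FWD_KS, REV_SWAP):
        return "AT swap"
    if amplicon(template, FWD_KS, REV_WILD):
        return "wild-type revertant"
    return "undetermined"


print(classify(WILD_TYPE), "/", classify(RECOMBINANT))
```

The same check can be repeated at the AT/DH junction with a forward primer in the AT domain and a reverse primer in the DH domain, so that each colony is called from two independent products.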
Replacement of Methoxyl with Hydrogen or Methyl at C-13 of FK-506 
The present invention also provides the 13-desmethoxy derivatives of FK-506 and 
the novel PKS enzymes that produce them. A variety of Streptomyces strains that produce 
FK-506 are known in the art, including S. tsukubaensis No. 9993 (FERM BP-927), 
described in U.S. Patent No. 5,624,852, incorporated herein by reference; S. 
hygroscopicus subsp. yakushimaensis No. 7238, described in U.S. Patent No. 4,894,366, 
incorporated herein by reference; S. sp. MA6858 (ATCC 55098), described in U.S. 
Patent No. 5,116,756, incorporated herein by reference; and S. sp. MA6548, described 
in Motamedi et al., 1998, "The biosynthetic gene cluster for the macrolactone ring of the 
immunosuppressant FK-506," Eur. J. Biochem. 256: 528-534, and Motamedi et al., 1997, 
"Structural organization of a multifunctional polyketide synthase involved in the 
biosynthesis of the macrolide immunosuppressant FK-506," Eur. J. Biochem. 244: 74-80, 
each of which is incorporated herein by reference. 

The complete sequence of the FK-506 gene cluster from Streptomyces sp. 
MA6548 is known, and the sequences of the corresponding gene clusters from other 
FK-506-producing organisms are highly homologous thereto. The novel FK-506 recombinant 
gene clusters of the present invention differ from the naturally occurring gene clusters in 
that the AT domain of module 8 of the naturally occurring PKSs is replaced by an AT 
domain specific for malonyl CoA or methylmalonyl CoA. These AT domain 
replacements are made at the DNA level, following the methodology described in 
Example 1. 
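An AT domain replacement made at the DNA level amounts to exchanging the segment bounded by two restriction sites in the host module for the corresponding segment from a donor. The following Python sketch illustrates that operation on strings. The function `swap_between_sites` and the toy host and donor sequences are hypothetical and serve only as an illustration; the sketch assumes each site occurs once at the domain boundary, and only the AvrII (CCTAGG) and XhoI (CTCGAG) recognition sequences are real.

```python
AVRII = "CCTAGG"
XHOI = "CTCGAG"


def swap_between_sites(host, donor, left_site, right_site):
    """Replace the host segment bounded by the two sites with the donor's.

    The sites themselves are included in the exchanged segment. A ValueError
    is raised if either site is missing from host or donor.
    """
    def bounds(seq):
        start = seq.index(left_site)
        end = seq.index(right_site, start + len(left_site)) + len(right_site)
        return start, end

    h_start, h_end = bounds(host)
    d_start, d_end = bounds(donor)
    return host[:h_start] + donor[d_start:d_end] + host[h_end:]


# Hypothetical toy sequences: flanks in lower-complexity runs for readability.
HOST = "AAAA" + AVRII + "TTTTTT" + XHOI + "GGGG"    # host module, native AT
DONOR = "GT" + AVRII + "ACACAC" + XHOI + "TG"       # donor module, new AT

HYBRID = swap_between_sites(HOST, DONOR, AVRII, XHOI)
print(HYBRID)
```

In the hybrid, everything outside the two sites comes from the host and everything between them from the donor, which mirrors how the hybrid module 8 listings share their flanking sequence with the naturally occurring module.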

The naturally occurring module 8 sequence for the MA6548 strain is shown 
below, followed by the illustrative hybrid module 8 sequences for the MA6548 strains. 

25 GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 
MRLYEAARRTGSPVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAAL DDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
30 RTTVRRAAVRERSLAD 

GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGI DSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVR LNA 
5 ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400 

TAVFDFPTPRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450 

DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
10 TAAAHDEPLAIVGMACR 

CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVASPQELWRL VAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
GTDAITEFPADRGWDV 
15 ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 

DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGGFLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 

ISPREALAMDPQQRVL 

TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 

ARGSDTGVFIGAFSYGY 

CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 

VTVDTACSSSLVALHQA 

AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 

VTVMASPGGFVEFSRQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 

GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGT ACGCGGCTCCGCG 1250 
40 DAERHGHTVLALVRG SA 

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVIHQALANAKLTP 
45 CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 

ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450 

PIEAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
50 PLLLGSLKSNIGHAQA 

CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAI RHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
ELPPTLHADEPSPHVDW 
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELL TSARPWPG 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700 
TGRPRRAAVS S FGVS GT 
5 AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750 

NAHI ILEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAI EAGPVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850 
10 GPLPAAPPSAPGEDLPL 

CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 

LVSARSPEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 
RAYLDTGP GVDRAAVA 
1 5 AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 

QTLARRTHFTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 

DTVIGAPPADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAACTCG 2100 
VYSGQGTQHPAMGEQL 

CGGCCGCGTTCCCCGTGTTCGCCGATGCCTGGCACGACGCGCTCCGACGG 2150 

AAAFPVFADAWHDALRR 

CTCGACGACCCCGACCCGCACGACCCCACACGGAGCCAGCACACGCTCTT 2200 

LDDPDPHDPTRSQHTLF 

CGCCCACCAGGCGGCGTTCACCGCCCTCCTGAGGTCCTGGGACATCACGC 2250 

AHQAAFTALLRSWDIT 
CGCACGCCGTCATCGGCCACTCGCTCGGCGAGATCACCGCCGCGTACGCC 2300 

PHAVIGHSLGEITAAYA 

GCCGGGATCCTGTCGCTCGACGACGCCTGCACCCTGATCACCACGCGTGC 2350 

AGILSLDDACTLITTRA 

CCGCCTCATGCACACGCTTCCGCCGCCCGGCGCCATGGTCACCGTGCTGA 2400 

RLMHTLPPPGAMVTVL 

CCAGCGAGGAGGAGGCCCGTCAGGCGCTGCGGCCGGGCGTGGAGATCGCC 24 50 
TSEEEARQALRPGVEIA 



35 GCGGTCTTCGGCCCGCACTCCGTCGTGCTCTCGGGCGACGAGGACGCCGT 2500 

AVFGPHSVVLSGDEDAV 
GCTCGACGTCGCACAGCGGCTCGGCATCCACCACCGTCTGCCCGCGCCGC 2550 

LDVAQRLG IHHRLPAP 
ACGCGGGCCACTCCGCGCACATGGAACCCGTGGCCGCCGAGCTGCTCGCC 2600 
40 HAGHSAHMEPVAAELLA 

ACCACTCGCGAGCTCCGTTACGACCGGCCCCACACCGCCATCCCGAACGA 2650 

TTRELRYDRPHTAI PND 
CCCCACCACCGCCGAGTACTGGGCCGAGCAGGTCCGCAACCCCGTGCTGT 2700 
PTTAEYWAEQVRNPVL 
45 TCCACGCCCACACCCAGCGGTACCCCGACGCCGTGTTCGTCGAGATCGGC 2750 

FHAHTQRYPDAVFVEIG 
CCCGGCCAGGACCTCTCACCGCTGGTCGACGGCATCGCCCTGCAGAACGG 2800 

PGQD LS PLVDGIALQNG 
CACGGCGGACGAGGTGCACGCGCTGCACACCGCGCTCGCCCGCCTCTTCA 2850 
50 TADEVHALHTALARLF 

CACGCGGCGCCACGCTCGACTGGTCCCGCATCCTCGGCGGTGCTTCGCGG 2900 
TRGATLDWSRI LGGASR 
CACGACCCTGACGTCCCCTCGTACGCGTTCCAGCGGCGTCCCTACTGGAT 2 950 
HDPDVPSYAFQRRPYWI 
CGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCA 3000 

E SAP PATADSGHPVLG 
CCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTG 3050 
TGVAVAGSPGRVFTGPV 
5 CCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGC 3100 

PAGAD RAVFIAELALAA 
CGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCG 3150 

ADAT DCATVEQLDVTS 
TGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGAT 3200 
10 V P G G S ARGRAT AQT W • V D 

GAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGG 3250 

EPAADGRRRFTVHTRVG 
CGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCG 3300 
DAPWTLHAEGVLRPGR 
1 5 TGCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTG. 3350 

VPQPEAVDTAWPPPGAV 
CCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGT 3400 

PADGL PGAWRRADQVFV 
CGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGC 34 50 
EAEVDSPDGFVAHPDL 

TCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGA 3500 

LDAVFSAVGDGSRQPTG 
TGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTG 3550 

WRDLAVHASDATVLRAC 
CCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTG 3600 

LTRRDSGVVELAAFDG 
CCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCG 3650 

AGMPVLTAESVTLGEVA 
TCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTT 3700 

SAGGSDESDGLLRLEWL 

4- GCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCT 3750 

PVAEAHYDGADELPEG 
ACACCCTC ATCACCGCCACACACCCCGACGACCCCG ACGACCCCACCAAC 3800 
YTLITATHPDDPDDPTN 
35 CCCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCAC 3850 

PHNT PTRTHTQTTRVLT 
CGCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACA 3900 

ALQHHLITTNHTLIVH 
CCACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCA 3950 
40 TTTDPPGAAVTGLTRTA 

C AAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCA 4 000 

QNEH PGRIHL IETHHPH 
CACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTAC 4 050 
TPLPLTQLTTLHQPHL 
45 GCCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACC 4100 

RLTNNTLHTPHLTPITT 
CACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAA 4150 

HHNTTTTTPNTPPLNPN 
CCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCG 4200 
50 HAILITGGSGTLAGIL 

CCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCA 4 250 
ARHLNHPHTYLLSRTPP 
CCCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCAC 4 300 
PPTT PGTHIPCDLTDPT 
CCAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCT 4350 

QITQALTHIPQPLTGI 
TCCACACCGCCGCC ACCCTCGACGACGCCACCCTCACCAACCTCACCCCC 4 4 00 
FHTAATLDDATLTNLTP 
5 CAACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCT 4 450 

QHLTTTLQPKADAAWHL 
CCACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCA 4 500 

HHHTQNQPLTHFVLYS 
GCGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCC 4550 
10 SAAATLGS PGQANYAAA 

AACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACC 4 600 

NAFLDALATHRHTQGQP 
CGCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCA 4 650 
ATT IAWGMWHTTTTLT 
1 5 GCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTG 4 700 

SQLTDSDRDRIRRGGFL 
CCGATCTCGGACGACGAGGGCATGC 
PISDDEGM 

The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of 
module 12 of rapamycin is shown below. 

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 

MRLYEAARRTGSPVVV 

GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAALDDAPDVPLLRGLR 

GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 

RTTVRRAAVRERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
30 TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGI DSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
35 VQLRNALTTATGVRLNA 

ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 

T AV F D F P T PRA LAARL G 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 450 
D ELAGTRAPVAARTAA 
40 CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAH DE PLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
45 GTDAITE FPADRGWDV 

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGG FLDGATGFDAAFFG 
50 GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 
I SPREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 

ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 
5 GTGADTNGFGATGSQT 

GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTACSS SLVALHQA 
1 0 AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 
GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGG FVEFS RQR 
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
15 GLAP DGRAKAFGAGADG 

TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCAGACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
DAERHGHTVLALVRGSA 
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 

QERVIHQALANAKLTP 

CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400 

ADVDAVEAHGTGTRLGD 

CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450 
PIEAQALLATYGQDRAT 

GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 

PLLLGSLKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 

ASGVAGIIKMVQAIRHG 

GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 

ELPPTLHADEPSPHVDW 
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWPG 
CCGGTCGCCCTAGGCGGGCAGGCGTGTCGTCCTTCGGGATCAGTGGCACC 1700 

TGRPRRAGVSSFG I SGT 
AACGCCCACGTCATCCTGGAAAGCGCACCCCCCACTCAGCCTGCGGACAA 17 50 
NAHVI LESAPPTQPADN 
40 CGCGGTGATCGAGCGGGCACCGGAGTGGGTGCCGTTGGTGATTTCGGCCA 1800 
AVIERAPEWVPLVI SA 
GGACCCAGTCGGCTTTGACTGAGCACGAGGGCCGGTTGCGTGCGTATCTG 1850 
RTQSALTEHEGRLRAYL 
GCGGCGTCGCCCGGGGTGGATATGCGGGCTGTGGCATCGACGCTGGCGAT 1900 
45 AASPGVDMRAVAS TLAM 

GACACGGTCGGTGTTCGAGCACCGTGCCGTGCTGCTGGGAGATGACACCG 1950 

TRSVFEHRAVLLGDDT 
TCACCGGCACCGCTGTGTCTGACCCTCGGGCGGTGTTCGTCTTCCCGGGA 2000 
VTGTAVSDPRAVFV FPG 
50 CAGGGGTCGCAGCGTGCTGGCATGGGTGAGGAACTGGCCGCCGCGTTCCC 2050 
QGSQRAGMGEELAAAFP 
CGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTGCTCGATGTGCCCG 2100 

VFARIHQQVWDLLDVP 
ATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGCCCTGTTCGCAATG 2150 
DLEVNETGYAQPALFAM 
CAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTGTACGACCGGACGC 2200 

QVALFGLLESWGVRPDA 
GGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCGTATGTGTCCGGGG 2250 
5 VIGHSVGELAAAYVS'G 

TGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGCGCGGGCTCGTCTG 2300 
VWSLEDACT. LVSARARL 
ATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTGTCCCGGTCTCGGA 2350 
M Q A L P A G G VMV AV P V S E 
10 GGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAGATCGCCGCGGTCA 24 00 
DEARAVLGEGVE IAAV 
ACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGCCGCCGTGCTGCAG 24 50 
NGPSSVVLSGDEAAVLQ 
GCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGACCAGCCACGCGTT 2500 
15 AAEGLGKWTRLATSHAF 

CCATTCCGCCCGTATGGAACCCATGCTGGAGGAGTTCCGGGCGGTCGCCG 2550 

HSARME PMLEE FRAVA 
AAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGCCGTTGGTGATCAG 2600 
EGLTYRT PQVSMAVGDQ 
20 GTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGGACACGGTCCGGTT 2650 
VTTAEYWVRQVRDTVRF 
CGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTCGTCGAGCTGGGTG 2700 

GEQVAS YEDAVFVELG 
CCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGCGATGCTGCACGGC 27 50 
25 ADRSLARLVDGV AMLHG. 

GACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCCACCTGTATGTCAA 2800 

DHEIQ AAIGALAHLYVN 
CGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGATGCTCCGGCAACAC 2850 
GVTVDW PALLG DAPAT 
30 GGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCAGCGCTACTGGCTC 2900 
RVLDLPTYAFQHQRYWL 
GAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCAC 2950 

ESAPPATADSGHPVLGT 
CGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGC 3000 
35 GVAVAG S PGRVFTG PV 

CCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCC 3050 
PAGAD R A V F I A E LALAA 
GCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGT 3100 
ADATDCATVEQLDVTSV 
40 GCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATG 3150 
PGGSARGRATAQTWVD 
AACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGC 3200 
EPAADGRRRFTVHTRVG 
GACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGT 3250 
45 DAPWTLHAEGVL RPGRV 

GCCCCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGC 3300 

PQPEAVDTAWPPPGAV 
CCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTC 3350 
PADGLPGAWRRADQVFV 
50 GAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCT 3400 
EAEVDS PD GFVAHPDLL 
CGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGAT 3450 

DAVFSAVGDGSRQPTG 
GGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGC 3500 



WRDLAVHASDATVLRAC 
CTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGC 3550 

LTRRDSGVVELAAFDGA 
CGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGT 3600 

GMPV LTAESVTLGEVA 
CGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTG 3650 
SA* GGSDESDGLLRLEWL 
CCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTA 3700 

PVAEAHYDGADE L PEGY 
CACCCTCATCACCGCCACACACCCCGACGACCCCGACGACCCCACCAACC 3750 

TLITATHPDDPDDPTN 
CCCACAACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACC 3800 
PHNT PTRTHTQTTRVLT 
GCCCTCCAACACCACCTCATCACCACCAACCACACCCTCATCGTCCACAC 3850 

ALQHHLITTNHTLIVHT 
CACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCAC 3900 

TTDPPGAAVTGLTRTA 
AAAACGAACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCAC 3950 
QNEHPGRIHLIETHHPH 
ACCCCACTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACG 4000 

TPLPLTQLTTLHQPHLR 
CCTCACCAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCC 4 050 

LTNNTLHTPHLT PITT 
ACCACAACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAAC 4100 
H HNTTTTTPNTPPLNPN 
CACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGC 4150 

HAILITGGSGTLAGILA 
CCGCCACCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCAC 4 200 

RHLNHPHTYLLS RTPP 
CCCCCACCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACC 4250 
PPTTPGTHIPCDLTDPT 
CAAATCACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTT 4300 

QITQALTHIPQPLTGIF 
CCACACCGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCC 4 350 

HTAATLDDATLTNLTP 
AACACCTCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTC 4400 
QHLTTTLQPKADAAWHL 
CACCACCACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAG 4 450 

HHHTQNQPLTHFVLYSS 
CGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCA 4500 

AAATLGS PGQAN YAAA 
ACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCC 4550 
NAFLDALATHRHTQGQP 
GCCACCACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAG 4 600 

ATT IAWGMWHTTTTLTS 
CCAACTCACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGC 4 650 

QLTDSDRDRIRRGGFL 
CGATCTCGGACGACGAGGGCATGC 
PISDDEGM 



The AvrII-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.






GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 50 

MRLYEAARRTGS PVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 
AAALDDAPDVPLLRGLR 
5 GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
RTTVRRAAVRERS'LA D 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 
10 SWNSTAT VLGHLGAE D I 

CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATTTFKELGIDSLTA 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
1 5 ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 
TAVFDFPTPRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 

DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAHDEPLAIVGMACR
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550
LPGGVASPQELWRLVAS
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600
GTDAITEFPADRGWDV
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650
DALYDPDPDAIGKTFVR
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700
HGGFLDGATGFDAAFFG
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750
ISPREALAMDPQQRVL
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800
LETSWEAFESAGITPDA
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850
ARGSDTGVFIGAFSYGY
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900
GTGADTNGFGATGSQT

GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEG PS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
40 VTVDTACSSSLVALHQA 

AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050 

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMASPGGFVEFSRQR 
45 GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250 
50 DAERHGHTVLALVRGSA 

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNG PS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350
QERVIHQALANAKLTP
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 1450 
P I EAQAL LAT YGQDRAT 
5 GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAIRHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
10 ELPP TLHADEPSPHVDW 

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWPG 
CCGGTCGCCCTAGGCGGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACC 1700 
TGRPRRAGVSSFGVSGT 
15 AACGCCCACGTCATCCTGGAGAGCGCACCCCCCGCTCAGCCCGCGGAGGA 17 50 
NAHVILE SAPPAQPAEE 
GGCGCAGCCTGTTGAGACGCCGGTGGTGGCCTCGGATGTGCTGCCGCTGG 1800 

AQPVETPVVASDVLPL 
TGATATCGGCCAAGACCCAGCCCGCCCTGACCGAACACGAAGACCGGCTG 1850 
20 VISAKTQPALTEHEDRL 

CGCGCCTACCTGGCGGCGTCGCCCGGGGCGGATAT ACGGGCTGTGGCATC 1900 

RAYLAAS PGADIRAVAS 
GACGCTGGCGGTGACACGGTCGGTGTTCGAGCACCGCGCCGTACTCCTTG 1950 
TLAVTRSVFEHRAVLL 
25 GAGATGACACCGTCACCGGCACCGCGGTGACCGACCCCAGGATCGTGTTT 2000 
GDDTVTGTAVTDPRIVF 
GTCTTTCCCGGGCAGGGGTGGCAGTGGCTGGGGATGGGCAGTGCACTGCG 2050 

VFPGQGWQWLGMGSALR 
CGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCGT 2100 
30 DSSVVFAERMAECAAA 

TGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGCG 2150 
LREFVDWDLFTVLDDPA 
GTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGGT 2200 
VVDRVDVVQPASWAMMV 
35 TTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTGA 2250 
SLAAVWQAAGVRPDAV 
TCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGTG 2300 
IGHSQGE IAAACVAGAV 
TCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCGC 2350 
40 SLRDAARIVTLRSQAIA 

CCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGGGC 2400 

RGLAGRGAMASVALPA 
AGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCCC 2450 
QDVELVDGAWIAAHNGP 
45 GCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCAC 2500 
ASTVIAGTPEAVDHVLT 
CGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTATG 2550 

AHEAQGVRVRRI TVDY 
CCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACATC 2 600 
50 ASHTPHVELIRDELLDI 

ACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCGT 2650 

TSDSSSQTPLVPWLSTV 
GGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGGA 2700 
DGTWVDSPLDGEYWYR
ACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGCC 2750
NLREPVGFHPAVSQLQA 
CAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGCA 2800 
QGDTVFVEVSASPVLLQ 
5 GGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGACG 2850 
AMDDDVVTVATLRRDD 
GCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGGC 2900 
G DATRMLTALAQAYVHG 
GTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTACT 2950 
10 VTVDWPAILGTTTTRVL 

GGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCGG 3000 

DLPTYAFQHQRYWLES 
CTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGTC 3050 
APPATADSGHPVLGTGV 
15 GCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCGG 3100 
AVAGS PGRVFTGPVPAG 
TGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGACG 3150 

ADRAVFIAELALAAAD 
CCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGGC 3200 
ATDCATVEQLDVTSVPG
GGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCGC 3250
GSARGRATAQTWVDEPA
CGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCCC 3300
ADGRRRFTVHTRVGDA
CGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCAG 3350
PWTLHAEGVLRPGRVPQ
CCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGGA 3400
PEAVDTAWPPPGAVPAD
CGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCCG 3450
GLPGAWRRADQVFVEA
AAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGCG 3500
EVDSPDGFVAHPDLLDA
GTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCGA 3550
VFSAVGDGSRQPTGWRD
CCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACCC 3600

LAVHASDATVLRACLT 
GCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAATG 3650 
RRDSGVVELAAFDGAGM 
CCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAGG 3700 
40 PVLTAESVTLGEVASAG 

CGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTGG 37 50 

GSDESDGLLRLEWLPV 
CGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCTC 3800 
AEAHYDGADELPEGYTL 
45 ATC ACCGCCACACACCCCGACGACCCCGACGACCCCACCAACCCCCACAA 3850 
ITATHPDDPDDPTNPHN 
C AC ACCCAC ACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTCC 3900 

TPTRTHTQTTRVLTAL 
AACACCACCTCATCACCACCAACCACACCCTCATCGTCCACACCACCACC 3950 
50 QHHLI TTNHTLIVHTTT 

GACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCAC AAAACGA 4 000 

DPPGAAVTGLTRTAQNE 
ACACCCCGGCCGCATCCACCTCATCGAAACCCACCACCCCCACACCCCAC 4050 
HPGRIHLIETHHPHTP 
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TCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCACC 4100 
LPLTQLTTLHQPHLRLT 
AACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACAA 4150 
NNTLHTPHLT PITTHHN 
5 CACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCCA 4200 
TTTTTPNTPPLNPNHA 
TCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCAC 4 250 
ILITGGSGTLAGILARH 
CTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCAC 4300 
10 LNHPHTYLLSRTPPPPT 

CACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATCA 4 350 

TPGTHIPCDLTDPTQI 
CCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACACC 4 400 
TQALTHI PQPLTGIFHT 
1 5 GCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACCT 4 4 50 
AATLDDATLTNLTPQHL 
CACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCACC 4500 

TTTLQPKADAAWHLHH 
ACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGCC 4 550 
20 HTQNQPLTHFVLYSSAA 

GCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCTT 4600
ATLGSPGQANYAAANAF
CCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACCA 4650
LDALATHRHTQGQPAT
CCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACTC 4700
TIAWGMWHTTTTLTSQL
ACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCTC 4750
TDSDRDRIRRGGFLPIS
GGACGACGAGGGCATGC
DDEGM



The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 12 of rapamycin is shown below.



GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 5 0 
35 MRLY EAARRTGS PVVV 

GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 
RTTVRRAAVRERSLAD 
40 GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 
45 PATTTFKELGIDSLTA 

TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA. 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 4 00 
TAVFDFPTPRALAARLG 
50 CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 
DELAGTRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500
TAAAHDEPLAIVGMACR
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 
5 GTDAITEFPADRGWDV 

ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKT FVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 700 
HGGFLDGATGFDAAFFG 
1 0 GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 
ISPREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGI TPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 
15 ARGSDTGVFIGAFSYGY 

CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000
VTVDTACSSSLVALHQA
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050
GQSLRSGECSLALVGG
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100
VTVMASPGGFVEFSRQR
GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150
GLAPDGRAKAFGAGADG
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200
TSFAEGAGALVVERLS
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 1250
DAERHGHTVLALVRGSA
GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300
ANSDGASNGLSAPNGPS
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350
QERVIHQALANAKLTP
CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 1400
ADVDAVEAHGTGTRLGD
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 
PIEAQALLATYGQDRAT 
40 GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 
PLLLGSLKSNIGHAQA 
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550 
ASGVAGI IKMVQAIRHG 
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600 
45 ELPPTLHADEPSPHVDW 

GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650 

TAGAVELLTSARPWP. G 
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700 
TGRPRRAAVSSFGVSGT 
50 AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 17 50 
NAHIILEAGPVKTGPVE 
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800 

AGAI EAGPVEVGPVEA 
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850
GPLPAAPPSAPGEDLPL
CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900 

LVSARSPEALDEQIGRL 
GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 
5 RAYLDTGPGVDRAAVA 

AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 
QTLARRTHFTHRAVLLG 
GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 
DTVIGAPP ADQADELVF 
10 CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100 
VYSGQGTQHPA MGEQL 
CCGCCGCGTTCCCCGTCTTCGCGCGGATCCATCAGCAGGTGTGGGACCTG 2150 
AAAFPVFARI HQQVWDL 
CTCGATGTGCCCGATCTGGAGGTGAACGAGACCGGTTACGCCCAGCCGGC 2200 
15 LDVPDLEVNETGYAQPA 

CCTGTTCGCAATGCAGGTGGCTCTGTTCGGGCTGCTGGAATCGTGGGGTG 2250 

L FAMQVAL FG L LE SWG 
TACGACCGGACGCGGTGATCGGCCATTCGGTGGGTGAGCTTGCGGCTGCG 2300 
VRPDAVI GHSVGELAAA 
TATGTGTCCGGGGTGTGGTCGTTGGAGGATGCCTGCACTTTGGTGTCGGC 2350
YVSGVWSLEDACTLVSA
GCGGGCTCGTCTGATGCAGGCTCTGCCCGCGGGTGGGGTGATGGTCGCTG 2400
RARLMQALPAGGVMVA
TCCCGGTCTCGGAGGATGAGGCCCGGGCCGTGCTGGGTGAGGGTGTGGAG 2450
VPVSEDEARAVLGEGVE
ATCGCCGCGGTCAACGGCCCGTCGTCGGTGGTTCTCTCCGGTGATGAGGC 2500
IAAVNGPSSVVLSGDEA
CGCCGTGCTGCAGGCCGCGGAGGGGCTGGGGAAGTGGACGCGGCTGGCGA 2550
AVLQAAEGLGKWTRLA
CCAGCCACGCGTTCCATTCCGCCCGTATGGAACCCATGCTGGAGGAGTTC 2600
TSHAFHSARMEPMLEEF
CGGGCGGTCGCCGAAGGCCTGACCTACCGGACGCCGCAGGTCTCCATGGC 2650
RAVAEGLTYRTPQVSMA

CGTTGGTGATCAGGTGACCACCGCTGAGTACTGGGTGCGGCAGGTCCGGG 27 00 
35 VGDQVTTAEYWVRQVR 

ACACGGTCCGGTTCGGCGAGCAGGTGGCCTCGTACGAGGACGCCGTGTTC 2750 
DTVRFGEQVAS YEDAVF 
GTCGAGCTGGGTGCCGACCGGTCACTGGCCCGCCTGGTCGACGGTGTCGC 2800 
VELGADRSLARLVDGVA 
40 GATGCTGCACGGCGACCACGAAATCCAGGCCGCGATCGGCGCCCTGGCCC 2850 
MLHG DHE I QAAI GALA 
ACCTGTATGTCAACGGCGTCACGGTCGACTGGCCCGCGCTCCTGGGCGAT 2900 
HLYVNGVTVDW PALLGD 
GCTCCGGCAACACGGGTGCTGGACCTTCCGACATACGCCTTCCAGCACCA 2950 
45 APATRVLDLPTYAFQHQ 

GCGCTACTGGCTCGAGTCGGCTCCCCCGGCCACGGCCGACTCGGGCCACC 3000 

RYWLESAPPATADSGH 
CCGTCCTCGGCACCGGAGTCGCCGTCGCCGGGTCGCCGGGCCGGGTGTTC 3050 
PVLGTGVAVAGSPGRVF 
50 ACGGGTCCCGTGCCCGCCGGTGCGGACCGCGCGGTGTTCATCGCCGAACT 3100 
TGPVPAGADRAVFIAEL 
GGCGCTCGCCGCCGCCGACGCCACCGACTGCGCCACGGTCGAACAGCTCG 3150 

ALAAADAT DCATVEQL 
ACGTCACCTCCGTGCCCGGCGGATCCGCCCGCGGCAGGGCCACCGCGCAG 3200
DVTSVPGGSARGRATAQ
ACCTGGGTCGATGAACCCGCCGCCGACGGGCGGCGCCGCTTCACCGTCCA 3250 

TWVDE PAADGRRRFTVH 
CACCCGCGTCGGCGACGCCCCGTGGACGCTGCACGCCGAGGGGGTTCTCC 3300 
5 TRVGDAPWTLHAEGVL 

GCCCCGGCCGCGTGCCGCAGCCCGAAGCCGTCGACACCGCCTGGCCCCCG 3350 
RPGRVPQPEAVDTAWPP 
CCGGGCGCGGTGCCCGCGGACGGGCTGCCCGGGGCGTGGCGACGCGCGGA 3400 
PGAVPADGLPGAWRRAD 
10 CCAGGTCTTCGTCGAAGCCGAAGTCGACAGCCCTGACGGCTTCGTGGCAC 34 50 
QVFVEAEVDSPDGFVA 
ACCCCGACCTGCTCGACGCGGTCTTCTCCGCGGTCGGCGACGGGAGCCGC 3500 
HPDLLDAVFSAVGDGSR 
CAGCCGACCGGATGGCGCGACCTCGCGGTGCACGCGTCGGACGCCACCGT 3550 
15 QPTGWRDLAVHAS DATV 

GCTGCGCGCCTGCCTCACCCGCCGCGACAGTGGTGTCGTGGAGCTCGCCG 3600 

LRACLT RRDSGVVELA 
CCTTCGACGGTGCCGGAATGCCGGTGCTCACCGCGGAGTCGGTGACGCTG 3650 
AFDGAGMPVLTAESVTL 
GGCGAGGTCGCGTCGGCAGGCGGATCCGACGAGTCGGACGGTCTGCTTCG 3700
GEVASAGGSDESDGLLR
GCTTGAGTGGTTGCCGGTGGCGGAGGCCCACTACGACGGTGCCGACGAGC 3750
LEWLPVAEAHYDGADE
TGCCCGAGGGCTACACCCTCATCACCGCCACACACCCCGACGACCCCGAC 3800
LPEGYTLITATHPDDPD
GACCCCACCAACCCCCACAACACACCCACACGCACCCACACACAAACCAC 3850
DPTNPHNTPTRTHTQTT
ACGCGTCCTCACCGCCCTCCAACACCACCTCATCACCACCAACCACACCC 3900
RVLTALQHHLITTNHT
TCATCGTCCACACCACCACCGACCCCCCAGGCGCCGCCGTCACCGGCCTC 3950
LIVHTTTDPPGAAVTGL
ACCCGCACCGCACAAAACGAACACCCCGGCCGCATCCACCTCATCGAAAC 4000
TRTAQNEHPGRIHLIET

CCACCACCCCCACACCCCACTCCCCCTCACCCAACTCACCACCCTCCACC 4 050 
35 HHPHTPLPLTQLTTLH 

AACCCCACCTACGCCTCACCAACAACACCCTCCACACCCCCCACCTCACC 4100 
QPHLRLTNNTLHTPHLT 
CCCATCACCACCCACCACAACACCACCACAACCACCCCCAACACCCCACC 4150 
PITTHHNTTTTTPNTPP 
40 CCTCAACCCCAACCACGCCATCCTCATCACCGGCGGCTCCGGCACCCTCG 4200 
LNPNHAILITGGSGTL 
CCGGCATCCTCGCCCGCCACCTCAACCACCCCCACACCTACCTCCTCTCC 4 250 
AGI LARHLN HPHTYLLS 
CGCACACCACCACCCCCCACCACACCCGGCACCCACATCCCCTGCGACCT 4 300 
45 RTPPPPTTPGTHI PCDL 

CACCGACCCCACCCAAATCACCCAAGCCCTCACCCACATACCACAACCCC 4 350 

TDPTQITQALTHI PQP 
TCACCGGCATCTTCCACACCGCCGCCACCCTCGACGACGCCACCCTCACC 4 400 
LTGI FHTAATLDDATLT 
50 AACCTCACCCCCCAACACCTCACCACCACCCTCCAACCCAAAGCCGACGC 4 450 
NLTPQHLTTTLQPKADA 
CGCCTGGCACCTCCACCACCACACCCAAAACCAACCCCTCACCCACTTCG 4500 

AWHLHHHTQNQPLTHF 
TCCTCTACTCCAGCGCCGCCGCCACCCTCGGCAGCCCCGGCCAAGCCAAC 4550
VLYSSAAATLGSPGQAN
TACGCCGCCGCCAACGCCTTCCTCGACGCCCTCGCCACCCACCGCCACAC 4 600 

YAAANAFLDALATHRHT 
CCAAGGACAACCCGCCACCACCATCGCCTGGGGCATGTGGCACACCACCA 4 650 

QGQPATT IAWGMWHTT 
CCACACTCACCAGCCAACTCACCGACAGCGACCGCGACCGCATCCGCCGC 47 00 
TTLTSQLTDSDRDRIRR 
GGCGGCTTCCTGCCGATCTCGGACGACGAGGGCATGC 

GGFLPISDDEGM 



The NheI-XhoI hybrid FK-506 PKS module 8 containing the AT domain of
module 13 of rapamycin is shown below.

GCATGCGGCTGTACGAGGCGGCACGGCGCACCGGAAGTCCCGTGGTGGTG 5 0 

MRLYEAARRTGS PVVV 
GCGGCCGCGCTCGACGACGCGCCGGACGTGCCGCTGCTGCGCGGGCTGCG 100 

AAALDDAPDVPLLRGLR 
GCGTACGACCGTCCGGCGTGCCGCCGTCCGGGAACGCTCTCTCGCCGACC 150 

RTTVRRAAVRERSLAD 
GCTCGCCGTGCTGCCCGACGACGAGCGCGCCGACGCCTCCCTCGCGTTCG 200 
RSPCCPTTSAPTPPSRS 
TCCTGGAACAGCACCGCCACCGTGCTCGGCCACCTGGGCGCCGAAGACAT 250 

SWNSTATVLGHLGAEDI 
CCCGGCGACGACGACGTTCAAGGAACTCGGCATCGACTCGCTCACCGCGG 300 

PATT T FKELGIDS.LT A 
TCCAGCTGCGCAACGCGCTGACCACGGCGACCGGCGTACGCCTCAACGCC 350 
VQLRNALTTATGVRLNA 
ACAGCGGTCTTCGACTTTCCGACGCCGCGCGCGCTCGCCGCGAGACTCGG 400 

TAVFDFPTPRALAARLG 
CGACGAGCTGGCCGGTACCCGCGCGCCCGTCGCGGCCCGGACCGCGGCCA 4 50 

DELAG TRAPVAARTAA 
CCGCGGCCGCGCACGACGAACCGCTGGCGATCGTGGGCATGGCCTGCCGT 500 
TAAAHDE PLAIVGMACR 
CTGCCGGGCGGGGTCGCGTCGCCACAGGAGCTGTGGCGTCTCGTCGCGTC 550 

LPGGVAS PQELWRLVAS 
CGGCACCGACGCCATCACGGAGTTCCCCGCGGACCGCGGCTGGGACGTGG 600 

GTDAITEFPADRGWDV 
ACGCGCTCTACGACCCGGACCCCGACGCGATCGGCAAGACCTTCGTCCGG 650 
DALYDPDPDAIGKTFVR 
CACGGCGGCTTCCTCGACGGTGCGACCGGCTTCGACGCGGCGTTCTTCGG 7 00 

HGGFLDGATGFDAAFFG 
GATCAGCCCGCGCGAGGCCCTGGCCATGGACCCGCAGCAACGGGTGCTCC 750 

I S PREALAMDPQQRVL 
TGGAGACGTCCTGGGAGGCGTTCGAAAGCGCGGGCATCACCCCGGACGCG 800 
LETSWEAFESAGITPDA 
GCGCGGGGCAGCGACACCGGCGTGTTCATCGGCGCGTTCTCCTACGGGTA 850 

ARGSDTGVFIGAFSYGY 
CGGCACGGGTGCGGATACCAACGGCTTCGGCGCGACAGGGTCGCAGACCA 900 

GTGADTNGFGATGSQT 
GCGTGCTCTCCGGCCGCCTCTCGTACTTCTACGGTCTGGAGGGCCCTTCG 950 
SVLSGRLSYFYGLEGPS 
GTCACGGTCGACACCGCCTGCTCGTCGTCACTGGTCGCCCTGCACCAGGC 1000 
VTVDTACSSSLVALHQA
AGGGCAGTCCCTGCGCTCGGGCGAATGCTCGCTCGCCCTGGTCGGCGGTG 1050

GQSLRSGECSLALVGG 
TCACGGTGATGGCGTCGCCCGGCGGATTCGTCGAGTTCTCCCGGCAGCGC 1100 
VTVMAS PGGFVEFSRQR 
5 GGGCTCGCGCCGGACGGGCGGGCGAAGGCGTTCGGCGCGGGCGCGGACGG 1150 
GLAPDGRAKAFGAGADG 
TACGAGCTTCGCCGAGGGCGCCGGTGCCCTGGTGGTCGAGCGGCTCTCCG 1200 

TSFAEGAGALVVERLS 
ACGCGGAGCGCCACGGCCACACCGTCCTCGCCCTCGTACGCGGCTCCGCG 12 50 
10 DAERHGHTVLALVRGSA 

GCTAACTCCGACGGCGCGTCGAACGGTCTGTCGGCGCCGAACGGCCCCTC 1300 

ANSDGASNGLSAPNGPS 
CCAGGAACGCGTCATCCACCAGGCCCTCGCGAACGCGAAACTCACCCCCG 1350 
QERVI HQALANAKLT P 
15 CCGATGTCGACGCGGTCGAGGCGCACGGCACCGGCACCCGCCTCGGCGAC 14 00 
ADVDAVEAHGTGTRLGD 
CCCATCGAGGCGCAGGCGCTGCTCGCGACGTACGGACAGGACCGGGCGAC 14 50 

PIEAQALLATYGQDRAT 
GCCCCTGCTGCTCGGCTCGCTGAAGTCGAACATCGGGCACGCCCAGGCCG 1500 

PLLLGSLKSNIGHAQA
CGTCAGGGGTCGCCGGGATCATCAAGATGGTGCAGGCCATCCGGCACGGG 1550
ASGVAGIIKMVQAIRHG
GAACTGCCGCCGACACTGCACGCGGACGAGCCGTCGCCGCACGTCGACTG 1600
ELPPTLHADEPSPHVDW
GACGGCCGGTGCCGTCGAGCTCCTGACGTCGGCCCGGCCGTGGCCGGGGA 1650
TAGAVELLTSARPWPG
CCGGTCGCCCGCGCCGCGCTGCCGTCTCGTCGTTCGGCGTGAGCGGCACG 1700
TGRPRRAAVSSFGVSGT
AACGCCCACATCATCCTTGAGGCAGGACCGGTCAAAACGGGACCGGTCGA 1750
NAHIILEAGPVKTGPVE
GGCAGGAGCGATCGAGGCAGGACCGGTCGAAGTAGGACCGGTCGAGGCTG 1800
AGAIEAGPVEVGPVEA
GACCGCTCCCCGCGGCGCCGCCGTCAGCACCGGGCGAAGACCTTCCGCTG 1850
GPLPAAPPSAPGEDLPL
CTCGTGTCGGCGCGTTCCCCGGAGGCACTCGACGAGCAGATCGGGCGCCT 1900
LVSARSPEALDEQIGRL

GCGCGCCTATCTCGACACCGGCCCGGGCGTCGACCGGGCGGCCGTGGCGC 1950 

RAYLDTGPGVDRAAVA 
AGACACTGGCCCGGCGTACGCACTTCACCCACCGGGCCGTACTGCTCGGG 2000 
40 QTLARRTHFTHRAVLLG 

GACACCGTCATCGGCGCTCCCCCCGCGGACCAGGCCGACGAACTCGTCTT 2050 

DTVIGAPPADQADELVF 
CGTCTACTCCGGTCAGGGCACCCAGCATCCCGCGATGGGCGAGCAGCTAG 2100 
VY'SGQGTQHPAMGEQL 
45 CCGATTCGTCGGTGGTGTTCGCCGAGCGGATGGCCGAGTGTGCGGCGGCG 2150 
ADSSVVFAERMAECAAA 
TTGCGCGAGTTCGTGGACTGGGATCTGTTCACGGTTCTGGATGATCCGGC 2200 

LREFVDWDLFTVLDDPA 
GGTGGTGGACCGGGTTGATGTGGTCCAGCCCGCTTCCTGGGCGATGATGG 2250 
50 VVDRVDVVQPASWAMM 

TTTCCCTGGCCGCGGTGTGGCAGGCGGCCGGTGTGCGGCCGGATGCGGTG 2300 
VSLAAVWQAAGVRP DAV 
ATCGGCCATTCGCAGGGTGAGATCGCCGCAGCTTGTGTGGCGGGTGCGGT 2350 
IGHSQGEIAAACVAGAV
GTCACTACGCGATGCCGCCCGGATCGTGACCTTGCGCAGCCAGGCGATCG 2400

SLRDAARIVTLRSQAI 
CCCGGGGCCTGGCGGGCCGGGGCGCGATGGCATCCGTCGCCCTGCCCGCG 2450 
ARGLAGRGAMASVALP A 
5 CAGGATGTCGAGCTGGTCGACGGGGCCTGGATCGCCGCCCACAACGGGCC 2500 
QDVELVDGAWIAAHNGP 
CGCCTCCACCGTGATCGCGGGCACCCCGGAAGCGGTCGACCATGTCCTCA 2550 

ASTVIAGT PEAVDHVL 
CCGCTCATGAGGCACAAGGGGTGCGGGTGCGGCGGATCACCGTCGACTAT 2600 
10 TAHEAQGVRVRRITVDY 

GCCTCGCACACCCCGCACGTCGAGCTGATCCGCGACGAACTACTCGACAT 2650 

ASHTPHVELIRDELLDI 
CACTAGCGACAGCAGCTCGCAGACCCCGCTCGTGCCGTGGCTGTCGACCG 2700 
TSDSSSQTPLVPWLST 
1 5 TGGACGGCACCTGGGTCGACAGCCCGCTGGACGGGGAGTACTGGTACCGG 2750 
VDG TWVDSPLDGEYWYR 
AACCTGCGTGAACCGGTCGGTTTCCACCCCGCCGTCAGCCAGTTGCAGGC 2800 

NLREPVGFHPAVSQLQA 
CCAGGGCGACACCGTGTTCGTCGAGGTCAGCGCCAGCCCGGTGTTGTTGC 2850 
20 QGDTV F V E V SAS PVLL 

AGGCGATGGACGACGATGTCGTCACGGTTGCCACGCTGCGTCGTGACGAC 2 900 
QAMDDDVVTVATLRRDD 
GGCGACGCCACCCGGATGCTCACCGCCCTGGCACAGGCCTATGTCCACGG 2 950 
GDATRMLTALAQAYVHG 
CGTCACCGTCGACTGGCCCGCCATCCTCGGCACCACCACAACCCGGGTAC 3000
VTVDWPAILGTTTTRV
TGGACCTTCCGACCTACGCCTTCCAACACCAGCGGTACTGGCTCGAGTCG 3050
LDLPTYAFQHQRYWLES
GCTCCCCCGGCCACGGCCGACTCGGGCCACCCCGTCCTCGGCACCGGAGT 3100
APPATADSGHPVLGTGV
CGCCGTCGCCGGGTCGCCGGGCCGGGTGTTCACGGGTCCCGTGCCCGCCG 3150
AVAGSPGRVFTGPVPA

GTGCGGACCGCGCGGTGTTCATCGCCGAACTGGCGCTCGCCGCCGCCGAC 3200 
GADRAVFIAELALAAAD 
35 GCCACCGACTGCGCCACGGTCGAACAGCTCGACGTCACCTCCGTGCCCGG 3250 
ATDCATVEQLDVTSVPG 
CGGATCCGCCCGCGGCAGGGCCACCGCGCAGACCTGGGTCGATGAACCCG 3300 

GSARGRATAQTWVDEP 
CCGCCGACGGGCGGCGCCGCTTCACCGTCCACACCCGCGTCGGCGACGCC 3350 
40 AADGRRRFTVHTRVGDA 

CCGTGGACGCTGCACGCCGAGGGGGTTCTCCGCCCCGGCCGCGTGCCCCA 34 00 

PWTLHAEGVLRPGRVPQ 
GCCCGAAGCCGTCGACACCGCCTGGCCCCCGCCGGGCGCGGTGCCCGCGG 3450 
PEAV DTAWPPPGAVPA 
45 ACGGGCTGCCCGGGGCGTGGCGACGCGCGGACCAGGTCTTCGTCGAAGCC 3500 
DGLPGAWRRADQVFVEA 
GAAGTCGACAGCCCTGACGGCTTCGTGGCACACCCCGACCTGCTCGACGC 3550 

EVDSPDGFVAHPDLLDA 
GGTCTTCTCCGCGGTCGGCGACGGGAGCCGCCAGCCGACCGGATGGCGCG 3600 
50 VFSAVGDGSRQPTGWR 

ACCTCGCGGTGCACGCGTCGGACGCCACCGTGCTGCGCGCCTGCCTCACC 3650 
DLAVHAS DATVLRACLT 
CGCCGCGACAGTGGTGTCGTGGAGCTCGCCGCCTTCGACGGTGCCGGAAT 3700 
RRDSGVVELAAFDGAGM
GCCGGTGCTCACCGCGGAGTCGGTGACGCTGGGCGAGGTCGCGTCGGCAG 3750

PVLTAESVTLGEVASA 
GCGGATCCGACGAGTCGGACGGTCTGCTTCGGCTTGAGTGGTTGCCGGTG 3800 
GGSDESDGLLRLEWLPV 
5 GCGGAGGCCCACTACGACGGTGCCGACGAGCTGCCCGAGGGCTACACCCT 3850 
AEAHYDGADELPEGYTL 
CATC ACCGCC ACACACCCCGACGACCCCGACGACCCCACCAACCCCCACA 3900 

I TATHPDDPDDPTNPH 
ACACACCCACACGCACCCACACACAAACCACACGCGTCCTCACCGCCCTC 3950 
10 NT PTRTHTQTTRVLTAL 

CAACACCACCTC ATC ACC ACC AACCACACCCTCATCGTCC ACACCACCAC 4000 

QHHLITTNHTLIVHTTT 
CGACCCCCCAGGCGCCGCCGTCACCGGCCTCACCCGCACCGCACAAAACG 4050 
DPPGAAVTGLTRTAQN 
1 5 AAC ACCCCGGCCGCATCCACCTCATCGAAACCCACC ACCCCC ACACCCCA 4100 
EHPGRI HLIETHHPHTP 
CTCCCCCTCACCCAACTCACCACCCTCCACCAACCCCACCTACGCCTCAC 4150 

LPLTQLTTLHQPHLRLT 
CAACAACACCCTCCACACCCCCCACCTCACCCCCATCACCACCCACCACA 4200 
20 NNTLH' TPHLTPITTHH 

ACACCACCACAACCACCCCCAACACCCCACCCCTCAACCCCAACCACGCC 4 250 
NTTTTT PNTPPLNPNHA 
ATCCTCATCACCGGCGGCTCCGGCACCCTCGCCGGCATCCTCGCCCGCCA 4 300 
I LITGGSGTLAGILARH 
25 CCTCAACCACCCCCACACCTACCTCCTCTCCCGCACACCACCACCCCCCA 4350 
LNHPHTYLLSRTPP PP 
CCACACCCGGCACCCACATCCCCTGCGACCTCACCGACCCCACCCAAATC 4 4 00 
TTPGTHI PCDLTDPTQI 
ACCCAAGCCCTCACCCACATACCACAACCCCTCACCGGCATCTTCCACAC 4 4 50 
30 TQALTHIPQPLTGIFH T 

CGCCGCCACCCTCGACGACGCCACCCTCACCAACCTCACCCCCCAACACC 4500 

AATLDDATLTNLTPQH 
TCACCACCACCCTCCAACCCAAAGCCGACGCCGCCTGGCACCTCCACCAC 4550 
LTTTLQ PKADAAWHLHH 
35 CACACCCAAAACCAACCCCTCACCCACTTCGTCCTCTACTCCAGCGCCGC 4 600 
HTQNQPLTHFVLYSSAA 
CGCCACCCTCGGCAGCCCCGGCCAAGCCAACTACGCCGCCGCCAACGCCT 4 650 

ATLGS PGQANYAAANA 
TCCTCGACGCCCTCGCCACCCACCGCCACACCCAAGGACAACCCGCCACC 4700 
40 FLDALATHRHTQGQPAT 

ACCATCGCCTGGGGCATGTGGCACACCACCACCACACTCACCAGCCAACT 4750 

TIAWGMWHTTTTLTSQL 
CACCGACAGCGACCGCGACCGCATCCGCCGCGGCGGCTTCCTGCCGATCT 4800 
TDSDR DRIRRGGFLPI 
45 CGGACGACGAGGGCATGC 
S D D E G M 
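
The engineered junctions in the hybrid module sequences above can be located by searching for the recognition sequences of the enzymes named in each heading. The following is a minimal Python sketch, not part of the original disclosure. It assumes the standard recognition sequences (AvrII CCTAGG, NheI GCTAGC, XhoI CTCGAG, SphI GCATGC); the function name `find_sites` is hypothetical, and the test fragment is the 1651-1700 line of the AvrII-XhoI rapamycin AT13 hybrid shown above.

```python
# Recognition sequences of the enzymes used for the hybrid modules
# (standard sites; assumed here, not restated in the text).
SITES = {
    "AvrII": "CCTAGG",
    "NheI": "GCTAGC",
    "XhoI": "CTCGAG",
    "SphI": "GCATGC",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [1-based start positions]} for each site found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start + 1)  # report 1-based coordinates
            start = seq.find(motif, start + 1)
        hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    # Positions 1651-1700 of the AvrII-XhoI hybrid carrying the rapamycin AT13 domain.
    fragment = "CCGGTCGCCCTAGGCGGGCGGGCGTGTCGTCCTTCGGAGTCAGCGGCACC"
    for enzyme, positions in find_sites(fragment).items():
        print(enzyme, positions)
```

Positions are relative to the fragment supplied, so the offset of the fragment within the full module must be added to obtain coordinates in the numbering used in the listings.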

Example 3 

Recombinant PKS Genes for 13-desmethoxy FK-506 and FK-520
The present invention provides a variety of recombinant PKS genes in addition to
those described in Examples 1 and 2 for producing 13-desmethoxy FK-506 and FK-520
compounds. This Example provides the construction protocols for recombinant FK-520
and FK-506 (from Streptomyces sp. MA6858 (ATCC 55098), described in U.S. Patent
No. 5,116,756, incorporated herein by reference) PKS genes in which the module 8 AT
coding sequences have been replaced by the rapAT3 (the AT domain from module
3 of the rapamycin PKS), rapAT12, eryAT1 (the AT domain from module 1 of the
erythromycin (DEBS) PKS), or eryAT2 coding sequences. Each of these constructs
provides a PKS that produces the 13-desmethoxy-13-methyl derivative, except for the
rapAT12 replacement, which provides the 13-desmethoxy derivative, i.e., it has a
hydrogen where the other derivatives have methyl.
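
The relationship between the replacement AT domain and the expected product can be written as a simple lookup. The Python sketch below is illustrative only; the table contents restate the paragraph above, while the names `C13_OUTCOME` and `expected_derivative` are hypothetical.

```python
# Expected C-13 outcome for each heterologous AT domain, as stated in this Example.
C13_OUTCOME = {
    "rapAT3": "13-desmethoxy-13-methyl",
    "rapAT12": "13-desmethoxy",  # hydrogen at C-13 instead of methyl
    "eryAT1": "13-desmethoxy-13-methyl",
    "eryAT2": "13-desmethoxy-13-methyl",
}

PARENTS = ("FK-506", "FK-520")


def expected_derivative(parent, at_domain):
    """Name the derivative expected when module 8 AT of `parent` is replaced."""
    if parent not in PARENTS:
        raise ValueError(f"unknown parent compound: {parent}")
    if at_domain not in C13_OUTCOME:
        raise ValueError(f"AT domain not covered by this Example: {at_domain}")
    return f"{C13_OUTCOME[at_domain]} {parent}"


if __name__ == "__main__":
    for at in C13_OUTCOME:
        print(at, "->", expected_derivative("FK-520", at))
```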
Figure 7 shows the process used to generate the AT replacement constructs. First,
a fragment of ~4.5 kb containing module 8 coding sequences from the FK-520 cluster of
ATCC 14891 was cloned using the convenient restriction sites SacI and SphI (Step A in
Figure 7). The choice of restriction sites used to clone a 4.0 - 4.5 kb fragment comprising
module 8 coding sequences from other FK-520 or FK-506 clusters can differ
depending on the DNA sequence, but the overall scheme is identical. The unique SacI
and SphI restriction sites at the ends of the FK-520 module 8 fragment were then changed
to unique BglII and NsiI sites by ligation to synthetic linkers (described in the preceding
Examples; see Step B of Figure 7). Fragments containing sequences 5' and 3' of the AT8
sequences were then amplified using primers, described above, that introduced either an
AvrII site or an NheI site at two different KS/AT boundaries and an XhoI site at the
AT/DH boundary (Step C of Figure 7). Heterologous AT domains from the rapamycin
and erythromycin gene clusters were amplified using primers, as described above, that
introduced the same sites as just described (Step D of Figure 7). The fragments were
ligated to give hybrid modules with in-frame fusions at the KS/AT and AT/DH
boundaries (Step E of Figure 7). Finally, these hybrid modules were ligated into the
BamHI and PstI sites of the KC515 vector. The resulting recombinant phage were used to
transform the FK-506 and FK-520 producer strains to yield the desired recombinant cells,
as described in the preceding Examples.
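
Steps C through E amount to replacing the region between the KS/AT site and the AT/DH site of the host module with the corresponding region of a donor module, while keeping the reading frame. The Python sketch below models this in silico under simplifying assumptions: each site occurs once in the order given, the sequences are short made-up strings rather than real module sequences, and the function name `swap_at_domain` is hypothetical.

```python
def split_at_sites(seq, left_site, right_site):
    """Split seq into (5' flank, site-to-site region, 3' flank)."""
    i = seq.index(left_site)                          # KS/AT boundary site
    j = seq.index(right_site, i + len(left_site))     # AT/DH boundary site
    end = j + len(right_site)
    return seq[:i], seq[i:end], seq[end:]


def swap_at_domain(host, donor, left_site="CCTAGG", right_site="CTCGAG"):
    """Replace the host region between the two sites with the donor region.

    Defaults are the AvrII and XhoI recognition sequences; pass
    left_site="GCTAGC" to model the NheI junction instead.
    """
    five_prime, _, three_prime = split_at_sites(host, left_site, right_site)
    _, donor_region, _ = split_at_sites(donor, left_site, right_site)
    hybrid = five_prime + donor_region + three_prime
    # An in-frame fusion requires the length change to be a multiple of three.
    if (len(hybrid) - len(host)) % 3 != 0:
        raise ValueError("replacement would shift the reading frame")
    return hybrid


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    host = "AAACCTAGGTTTTTTCTCGAGCCC"
    donor = "GGGCCTAGGACGACGACGCTCGAGTTT"
    print(swap_at_domain(host, donor))
```

The frame check is deliberately coarse: it confirms only that the insert length preserves the frame of the 3' flank, not that the junction codons encode the intended residues.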
The following table shows the location and sequences surrounding the engineered
site of each of the heterologous AT domains employed. The FK-506 hybrid construct was
used as a control for the FK-520 recombinant cells produced, and a similar FK-520
hybrid construct was used as a control for the FK-506 recombinant cells.



Heterologous AT       Enzyme   Location of Engineered Site

FK-506 AT8            AvrII    GGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTC
(hydroxymalonyl)               GRPRRAAVSSF
                      NheI     ACCCAGCATCCCGCGATGGGTGAGCGgctcgcC
                               TQHPAMGERLA
                      XhoI     TACGCCTTCCAGCGGCGGCCCTACTGGatcgag
                               YAFQRRPYWIE

rapamycin AT3         AvrII    GACCGGccccgtCGGGCGGGCGTGTCGTCCTTC
(methylmalonyl)                DRPRRAGVSSF
                      NheI     TGGCAGTGGCTGGGGATGGGCAGTGCcctgcgG
                               WQWLGMGSALR
                      XhoI     TACGCCTTCCAACACCAGCGGTACTGGgtcgag
                               YAFQHQRYWVE

rapamycin AT12        AvrII    GGCCGAgcgcgcCGGGCAGGCGTGTCGTCCTTC
(malonyl)                      GRARRAGVSSF
                      NheI     TCGCAGCGTGCTGGCATGGGTGAGGAactggcC
                               SQRAGMGEELA
                      XhoI     TACGCCTTCCAGCACCAGCGCTACTGGctcgaa
                               YAFQHQRYWLE

DEBS AT1              AvrII    GCGCGAccgcgcCGGGCGGGGGTCTCGTCGTTC
(methylmalonyl)                ARPRRAGVSSF
                      NheI     TGGCAGTGGGCGGGCATGGCCGTCGAcctgctC
                               WQWAGMAVDLL
                      XhoI     TACCCGTTCCAGCGCGAGCGCGTCTGGctcgaa
                               YPFQRERVWLE

DEBS AT2              AvrII    GACGGGgtgcgcCGGGCAGGTGTGTCGGCGTTC
(methylmalonyl)                DGVRRAGVSAF
                      NheI     GCCCAGTGGGAAGGCATGGCGCGGGAgttgttG
                               AQWEGMARELL
                      XhoI     TATCCTTTCCAGGGCAAGCGGTTCTGGctgctg
                               YPFQGKRFWLL

The sequences shown below provide the location of the KS/AT boundaries
chosen in the FK-520 module 8 coding sequences. Regions where AvrII and NheI sites
were engineered are indicated by lower case and underlining.

CCGGCGCCGTCGAACTGCTGACGTCGGCCCGGCCGTGGCCCGAGACCGACCGGccaC22 c 
AGAVELLTSARPWPETDRPR
GTGCCGCCGTCTCCTCGTTCGGGGTGAGCGGCACCAACGCCCACGTCATCCTGGAGGCCG
RAAVSSFGVSGTNAHVILEA
GACCGGTAACGGAGACGCCCGCGGCATCGCCTTCCGGTGACCTTCCCCTGCTGGTGTCGG
GPVTETPAASPSGDLPLLVS
CACGCTCACCGGAAGCGCTCGACGAGCAGATCCGCCGACTGCGCGCCTACCTGGACACCA
ARSPEALDEQIRRLRAYLDT
CCCCGGACGTCGACCGGGTGGCCGTGGCACAGACGCTGGCCCGGCGCACACACTTCGCCC
TPDVDRVAVAQTLARRTHFA
ACCGCGCCGTGCTGCTCGGTGACACCGTCATCACCACACCCCCCGCGGACCGGCCCGACG
HRAVLLGDTVITTPPADRPD
AACTCGTCTTCGTCTACTCCGGCCAGGGCACCCAGCATCCCGCGATGGGCGAGCA£CtC2 
ELVFVYSGQGTQHPAMGEQL 
cCGCCGCCCATCCCGTGTTCGCCGACGCCTGGCATGAAGCGCTCCGCCGCCTTGACAACC 
AAAHPVFADAWHEALRRLDN 

The sequences shown below provide the location of the AT/DH boundary chosen 
in the FK-520 module 8 coding sequences. The region where an XhoI site was
engineered is indicated by lower case and underlining. 

TCCTCGGGGCTGGGTCACGGCACGACGCGGATGTGCCCGCGTACGCGTTCCAACGGCGGC 
ILGAGSRHDADVPAYAFQRR
ACTACTGGatcgagTCGGCACGCCCGGCCGCATCCGACGCGGGCCACCCCGTGCTGGGCT
HYWIESARPAASDAGHPVLG

The sequences shown below provide the location of the KS/AT boundaries
chosen in the FK-506 module 8 coding sequences. Regions where AvrII and NheI sites
were engineered are indicated by lower case and underlining.

TCGGCCAGGCCGTGGCCGCGGACCGGCCGTccgcgcCGTGCGGCGGTCTCGTCGTTCGGG
SARPWPRTGRPRRAAVSSFG
GTGAGCGGCACCAACGCCCACATCATCCTGGAGGCCGGACCCGACCAGGAGGAGCCGTCG
VSGTNAHIILEAGPDQEEPS
GCAGAACCGGCCGGTGACCTCCCGCTGCTCGTGTCGGCACGGTCCCCGGAGGCACTGGAC
AEPAGDLPLLVSARSPEALD
GAGCAGATCGGGCGCCTGCGCGACTATCTCGACGCCGCCCCCGGCGTGGACCTGGCGGCC
EQIGRLRDYLDAAPGVDLAA
GTGGCGCGGACACTGGCCACGCGTACGCACTTCTCCCACCGCGGCGTACTGCTCGGTGAC
VARTLATRTHFSHRAVLLGD
ACCGTCATCACCGCTCCCCCCGTGGAACAGCCGGGCGAGCTCGTCTTCGTCTACTCGGGA
TVITAPPVEQPGELVFVYSG
CAGGGCACCCAGCATCCCGCGATGGGTGAGCGgctcgcCGCAGCCTTCCCCGTGTTCGCC
QGTQHPAMGERLAAAFPVFA
GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGGATCGAGTCCGCGCCG
DPDVPAYAFQRRPYWIESAP

The sequences shown below provide the location of the AT/DH boundary chosen
in the FK-506 module 8 coding sequences. The region where an XhoI site was
engineered is indicated by lower case and underlining.

GACCCGGACGTACCCGCCTACGCCTTCCAGCGGCGGCCCTACTGGatcgagTCCGCGCCG
DPDVPAYAFQRRPYWIESAP 

Example 4

Replacement of Methoxyl with Hydrogen or Methyl at C-15 of FK-506 and FK-520
The methods and reagents of the present invention also provide novel FK-506 and
FK-520 derivatives in which the methoxy group at C-15 is replaced by a hydrogen or
methyl. These derivatives are produced in recombinant host cells of the invention that
express recombinant PKS enzymes that produce the derivatives. These recombinant PKS
enzymes are prepared in accordance with the methodology of Examples 1 and 2, with the
exception that the AT domain of module 7, instead of module 8, is replaced. Moreover, the
present invention provides recombinant PKS enzymes in which the AT domains of both
modules 7 and 8 have been changed. The table below summarizes the various compounds
provided by the present invention.


Compound    C-13        C-15        Derivative Provided

FK-506      hydrogen    hydrogen    13,15-didesmethoxy-FK-506
FK-506      hydrogen    methoxy     13-desmethoxy-FK-506
FK-506      hydrogen    methyl      13,15-didesmethoxy-15-methyl-FK-506
FK-506      methoxy     hydrogen    15-desmethoxy-FK-506
FK-506      methoxy     methoxy     Original Compound - FK-506
FK-506      methoxy     methyl      15-desmethoxy-15-methyl-FK-506
FK-506      methyl      hydrogen    13,15-didesmethoxy-13-methyl-FK-506
FK-506      methyl      methoxy     13-desmethoxy-13-methyl-FK-506
FK-506      methyl      methyl      13,15-didesmethoxy-13,15-dimethyl-FK-506
FK-520      hydrogen    hydrogen    13,15-didesmethoxy-FK-520
FK-520      hydrogen    methoxy     13-desmethoxy-FK-520
FK-520      hydrogen    methyl      13,15-didesmethoxy-15-methyl-FK-520
FK-520      methoxy     hydrogen    15-desmethoxy-FK-520
FK-520      methoxy     methoxy     Original Compound - FK-520
FK-520      methoxy     methyl      15-desmethoxy-15-methyl-FK-520
FK-520      methyl      hydrogen    13,15-didesmethoxy-13-methyl-FK-520
FK-520      methyl      methoxy     13-desmethoxy-13-methyl-FK-520
FK-520      methyl      methyl      13,15-didesmethoxy-13,15-dimethyl-FK-520
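
The naming pattern in the table above is regular enough to be generated programmatically, which offers a quick consistency check on the eighteen entries. The following is an illustrative sketch only. The function name `derivative_name` is hypothetical, and the rule it encodes (a position is "desmethoxy" whenever its substituent is not methoxy, with a "methyl" suffix added for methyl-substituted positions) is inferred from the table rather than stated in the text.

```python
from itertools import product

SUBSTITUENTS = ("hydrogen", "methoxy", "methyl")


def derivative_name(parent, c13, c15):
    """Build the derivative name used in the table from the C-13/C-15 groups."""
    groups = {13: c13, 15: c15}
    # Positions at which the native methoxy group has been removed.
    removed = [str(pos) for pos, grp in groups.items() if grp != "methoxy"]
    # Positions that now carry a methyl group instead.
    methyl = [str(pos) for pos, grp in groups.items() if grp == "methyl"]
    if not removed:
        return f"Original Compound - {parent}"
    name = ",".join(removed)
    name += "-didesmethoxy" if len(removed) == 2 else "-desmethoxy"
    if methyl:
        name += "-" + ",".join(methyl)
        name += "-dimethyl" if len(methyl) == 2 else "-methyl"
    return f"{name}-{parent}"


# Enumerate every C-13 / C-15 combination for both parent compounds.
rows = [
    (parent, c13, c15, derivative_name(parent, c13, c15))
    for parent in ("FK-506", "FK-520")
    for c13, c15 in product(SUBSTITUENTS, repeat=2)
]

for parent, c13, c15, name in rows:
    print(f"{parent:<10}{c13:<12}{c15:<12}{name}")
```

Each parent compound yields nine rows, one of which is the unmodified original, so eight novel derivatives per parent are enumerated.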



Example 5 



Replacement of Methoxyl with Ethyl at C-13 and/or C-15 of FK-506 and FK-520 
The present invention also provides novel FK-506 and FK-520 derivative 
compounds in which the methoxy groups at either or both the C-13 and C-15 positions 
are instead ethyl groups. These compounds are produced by novel PKS enzymes of the 
invention in which the AT domains of modules 8 and/or 7 are converted to ethylmalonyl 
specific AT domains by modification of the PKS gene that encodes the module. 
Ethylmalonyl specific AT domain coding sequences can be obtained from, for example, 
the FK-520 PKS genes, the niddamycin PKS genes, and the tylosin PKS genes. The 
novel PKS genes of the invention include not only those in which either or both of the 
AT domains of modules 7 and 8 have been converted to ethylmalonyl specific AT 
domains but also those in which one of the modules is converted to an ethylmalonyl 
specific AT domain and the other is converted to a malonyl specific or a methylmalonyl 
specific AT domain. 



Example 6

Neurotrophic Compounds
The compounds described in Examples 1 - 4, inclusive, have immunosuppressant
activity and can be employed as immunosuppressants in a manner and in formulations
similar to those employed for FK-506. The compounds of the invention are generally
effective for the prevention of organ rejection in patients receiving organ transplants and
in particular can be used for immunosuppression following orthotopic liver
transplantation. These compounds also have pharmacokinetic properties and metabolism
that are more advantageous for certain applications relative to those of FK-506 or
FK-520. These compounds are also neurotrophic; however, for use as neurotrophins, it is
desirable to modify the compounds to diminish or abolish their immunosuppressant
activity. This can be readily accomplished by hydroxylating the compounds at the C-18
position using established chemical methodology or novel FK-520 PKS genes provided
by the present invention.



Thus, in one aspect, the present invention provides a method for stimulating nerve
growth that comprises administering a therapeutically effective dose of
18-hydroxy-FK-520. In another embodiment, the compound administered is a
C-18,20-dihydroxy-FK-520 derivative. In another embodiment, the compound administered
is a C-13-desmethoxy and/or C-15-desmethoxy 18-hydroxy-FK-520 derivative. In another
embodiment, the compound administered is a C-13-desmethoxy and/or C-15-desmethoxy
18,20-dihydroxy-FK-520 derivative. In other embodiments, the compounds are the
corresponding analogs of FK-506. The 18-hydroxy compounds of the invention can be
prepared chemically, as described in U.S. Patent No. 5,189,042, incorporated herein by
reference, or by fermentation of a recombinant host cell provided by the present invention
that expresses a recombinant PKS in which the module 5 DH domain has been deleted or
rendered non-functional.

The chemical methodology is as follows. A compound of the invention (~200 mg)
is dissolved in 3 mL of dry methylene chloride and added to 45 µL of 2,6-lutidine, and
the mixture stirred at room temperature. After 10 minutes, tert-butyldimethylsilyl
trifluoromethanesulfonate (64 µL) is added by syringe. After 15 minutes, the reaction
mixture is diluted with ethyl acetate, washed with saturated bicarbonate, washed with
brine, and the organic phase dried over magnesium sulfate. Removal of solvent in vacuo
and flash chromatography on silica gel (ethyl acetate:hexane (1:2) plus 1% methanol)
gives the protected compound, which is dissolved in 95% ethanol (2.2 mL) and to which
is added 53 µL of pyridine, followed by selenium dioxide (58 mg). The flask is fitted
with a water condenser and heated to 70°C on a mantle. After 20 hours, the mixture is
cooled to room temperature, filtered through diatomaceous earth, and the filtrate poured
into a saturated sodium bicarbonate solution. This is extracted with ethyl acetate, and the
organic phase is washed with brine and dried over magnesium sulfate. The solution is
concentrated and purified by flash chromatography on silica gel (ethyl acetate:hexane
(1:2) plus 1% methanol) to give the protected 18-hydroxy compound. This compound is
dissolved in acetonitrile and treated with aqueous HF to remove the protecting groups.
After dilution with ethyl acetate, the mixture is washed with saturated bicarbonate and
brine, dried over magnesium sulfate, filtered, and evaporated to yield the 18-hydroxy
compound. Thus, the present invention provides the C-18-hydroxyl derivatives of the
compounds described in Examples 1 - 4.
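
For orientation, the reagent quantities given in the procedure above can be converted to approximate molar equivalents. The sketch below is illustrative only and rests on assumptions that are not in the text: the substrate is taken to have the molecular weight of unmodified FK-520 (about 792 g/mol, so any derivative will differ), and the molecular weights and densities of the reagents are approximate literature values. The selenium dioxide and pyridine are added to the protected intermediate, whose yield is not stated, so their equivalents relative to the starting 200 mg are rough lower bounds on the true ratios.

```python
# Assumed approximate molecular weights in g/mol (not given in the text).
MOLECULAR_WEIGHT = {
    "substrate": 792.0,        # assumption: unmodified FK-520
    "2,6-lutidine": 107.15,
    "TBSOTf": 264.34,          # tert-butyldimethylsilyl trifluoromethanesulfonate
    "pyridine": 79.10,
    "SeO2": 110.97,
}

# Assumed approximate densities in g/mL for the reagents measured by volume.
DENSITY = {"2,6-lutidine": 0.925, "TBSOTf": 1.151, "pyridine": 0.982}


def mmol_from_mass(name, milligrams):
    """Millimoles from a mass in mg (mg divided by g/mol gives mmol)."""
    return milligrams / MOLECULAR_WEIGHT[name]


def mmol_from_volume(name, microliters):
    """Millimoles from a volume in uL (uL times g/mL gives mg)."""
    return microliters * DENSITY[name] / MOLECULAR_WEIGHT[name]


# Quantities as stated in the procedure.
substrate_mmol = mmol_from_mass("substrate", 200)
reagent_mmol = {
    "2,6-lutidine": mmol_from_volume("2,6-lutidine", 45),
    "TBSOTf": mmol_from_volume("TBSOTf", 64),
    "pyridine": mmol_from_volume("pyridine", 53),
    "SeO2": mmol_from_mass("SeO2", 58),
}

equivalents = {name: mmol / substrate_mmol for name, mmol in reagent_mmol.items()}

for name, eq in equivalents.items():
    print(f"{name:<14}{reagent_mmol[name]:.3f} mmol  {eq:.2f} equiv")
```

Substituting the molecular weight of the actual derivative being hydroxylated for the `substrate` entry adjusts all of the ratios accordingly.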

Those of skill in the art will recognize that other suitable chemical procedures can
be used to prepare the novel 18-hydroxy compounds of the invention. See, e.g., Kawai et
al., 9 Jan. 1993, Structure-activity profiles of macrolactam immunosuppressant FK-506
analogues, FEBS Letters 316(2): 107-113, incorporated herein by reference. These
methods can be used to prepare both the C18-[S]-OH and C18-[R]-OH enantiomers, with
the R enantiomer showing a somewhat lower IC50, which may be preferred in some
applications. See Kawai et al., supra. Another preferred protocol is described in Umbreit
and Sharpless, 1977, JACS 99(16): 5526-28, although it may be preferable to use 30
equivalents each of SeO2 and t-BuOOH rather than the 0.02 and 3-4 equivalents,
respectively, described in that reference.

All scientific and patent publications referenced herein are hereby incorporated by
reference. The invention having now been described by way of written description and
example, those of skill in the art will recognize that the invention can be practiced in a
variety of embodiments, and that the foregoing description and examples are for purposes
of illustration and not limitation of the following claims.






